







FINITE DIMENSIONAL 
VECTOR SPACES 


By 

PAUL R. HALMOS 


PRINCETON 

PRINCETON UNIVERSITY PRESS 

LONDON: HUMPHREY MILFORD

OXFORD UNIVERSITY PRESS

1948 



Copyright 1941 
Princeton University Press
Copyright 1947 
Paul R. Halmos 


Lithoprinted in U.S.A.

EDWARDS BROTHERS, INC.

ANN ARBOR, MICHIGAN

1949 



PREFACE 


That Hilbert space theory and elementary matrix
theory are intimately associated came as a surprise to
me and to many colleagues of my generation only after
studying the two subjects separately. This is deplorable:
it took us as much time to discover for ourselves that
there is a connection as it took to learn the two seemingly
separate disciplines. I present this little book
in an attempt to remedy the situation. Addressing the
advanced undergraduate or beginning graduate student, I
treat linear transformations on finite dimensional vector
spaces by the methods of more general theories. My
purpose is to emphasize the simple geometric notions
common to many parts of mathematics and its applications,
and to do this in a language which gives away the trade
secrets and tells the student what is in the back of the
minds of people proving theorems about integral equations
and Banach spaces. The reader does not, however,
have to share my prejudiced motivation. Except for an
occasional reference to undergraduate mathematics the
book is self contained and may be read by anyone who is
trying to get a feeling for the linear problems usually
discussed in courses on "matrix theory" or "higher algebra".
The algebraic, coordinate-free, methods do
not lose power and elegance by specialization to a finite
number of dimensions, and are, in my belief, as elementary
as the classical coordinatized treatment.

I originally intended this book to contain a theorem
if and only if an infinite dimensional generalization of
it already exists. Barring a few concessions to the
tempting easiness of some essentially finite dimensional
notions and results, I have followed this plan. My emphasis,
however, is more on method than on results. The
reader may sometimes see some obvious way of shortening
the proofs I give. (He is, for example, very likely to
do this in connection with the representation of a
linear functional by an inner product or the treatment
of direct products of unitary spaces.) The chances are
that the infinite dimensional analog of the shorter
proof is either much longer or else non existent.

To supplement the hints in the body of the book concerning
the various directions in which a student may
proceed, I have appended a bibliography. This very
short list makes no pretense to completeness; it consists
merely of the books which have helped me the most.
Their perusal should give the student an idea of most of
the important extensions of the subjects I treat.

In conclusion I want to express my really sincere
thanks to virtually every mathematician in Princeton.
Most of them have read parts of the manuscript, discussed
the project with me, and were very kind in giving
encouragement and criticism. I am particularly grateful
to two men: John von Neumann, who is one of the originators
of the modern spirit and methods which I have
tried to present and whose teaching was the inspiration
for this book, and J. L. Doob, who read the entire manuscript
and made many valuable suggestions.


Paul R. Halmos 

The Institute for Advanced Study, 

Princeton, New Jersey 



TABLE OF CONTENTS


Chapter I. SPACES

§1. Definition of vector space
§2. Examples of vector spaces
§3. Comments on notation and terminology
§4. Definition of linear dependence
§5. Characterization of linear dependence
§6. Definition and construction of bases
§7. Dimension of a vector space
§8. Isomorphism of vector spaces
§9. Linear manifolds
§10. Calculus of linear manifolds
§11. Dimension of a linear manifold
§12. Conjugate spaces
§13. Notation for linear functionals
§14. Bases in conjugate spaces
§15. Reflexivity of finite dimensional spaces
§16. Annihilators of linear manifolds
§17. Direct sums
§18. Dimension of a direct sum
§19. Conjugate spaces of direct sums

Chapter II. TRANSFORMATIONS

§20. Definitions and examples of linear transformations
§21. Linear transformations as a vector space
§22. Products of linear transformations
§23. Polynomials in a linear transformation
§24. Inverse of a linear transformation
§25. Definition of matrices
§26. Isomorphism between matrices and operators
§27. Reducibility
§28. Complete reducibility and direct sums of transformations
§29. Projections
§30. Algebraic combinations of projections
§31. Application to reducibility and involutions
§32. Adjoint operators
§33. Adjoint of a projection
§34. Change of basis
§35. Linear transformations under a change of basis
§36. Range and null space of a linear transformation
§37. Rank and nullity
§38. Linear transformations of rank one
§39. Determinants and the spectral terminology
§40. Multiplicities; the trace of a linear transformation
§41. Super diagonal form

Chapter III. ORTHOGONALITY

§42. Concept of an inner product
§43. Generalization to complex spaces
§44. Formal definition of unitary space
§45. Applications of Schwarz's inequality
§46. Orthogonality
§47. Characterizations of completeness
§48. Existence of complete orthonormal sets
§49. Projection theorem
§50. Representation of linear functionals
§51. Relation between parentheses and brackets
§52. Comparison of the two "natural" isomorphisms from $\mathfrak{V}$ to $\mathfrak{V}^{**}$
§53. Linear transformations on a unitary space
§54. Hermitian transformations
§55. Algebraic combinations of Hermitian transformations
§56. Non negative transformations
§57. Perpendicular projections
§58. Algebraic combinations of perpendicular projections
§59. Unitary transformations
§60. Change of orthogonal basis
§61. Cayley transform
§62. Proper values of Hermitian and unitary transformations
§63. Spectral theorem for Hermitian transformations
§64. Normal transformations
§65. Functions of normal transformations
§66. Properties of non negative transformations
§67. Polar decomposition
§68. Problems of commutativity
§69. Hermitian transformations of rank one
§70. Convergence of vectors
§71. Bound of a linear transformation
§72. Expressions for the bound
§73. Bounds of a Hermitian transformation
§74. Minimax principle
§75. Convergence of linear transformations
§76. Ergodic theorem for unitary transformations
§77. Power series

Appendix I. THE CLASSICAL CANONICAL FORM
Appendix II. DIRECT PRODUCTS
Appendix III. HILBERT SPACE

Bibliography
List of Notations
Index of Definitions

























ERRATA 

p. sk, line 11; Instead of x^Cy) read Zq(j). 
p. s 6 , line 13: Instead of a • IS read a • S . 


p. line 18 : Instead of S] read S 


p. 55 > line 18 ; Instead of 371 , o 37 lg read 371 , 


n 31 


2 * 


p. line 1: Instead of (AB*B~^) read p(AB*B"’). 
p. 97 t line 95 Instead of ( 7 ) read (1). 
p. 122, last line: Instead of (A + 11 ) read (A -i- 11 )y. 
p. 137: The first two sentences of the paragraph beginning
near the bottom of the page should read as follows:
Using the above characterization of non negativeness, the


[i f] 


reader may verify that If A 

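and if $C$ is a Hermitian matrix for which both $A \le C$ and
$B \le C$, then

$$C = \begin{pmatrix} 1+\epsilon & \theta \\ \bar{\theta} & 1+\delta \end{pmatrix},$$

where $\epsilon \ge 0$, $\delta \ge 0$, and $|\theta|^2 \le \min\{\epsilon(1+\delta),\ \delta(1+\epsilon)\}$.
It is also easy to show that, for a matrix of the type of
$C$, $C \le 1$ can hold if and only if $C = 1$.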


1948 reprinting



Chapter I


SPACES

§1. DEFINITION OF VECTOR SPACE

In what follows we shall have occasion to use different
classes of numbers (such as the class of all real
numbers or the class of all complex numbers). Because
we don't want, at this early stage, to commit ourselves
to any specific class we shall adopt the dodge of referring
to numbers as scalars. The reader will not lose
anything essential if he consistently interprets scalars
as real numbers or as complex numbers: in the examples
that we shall give both classes will occur.

DEFINITION. A vector space, $\mathfrak{V}$, is a
set of elements x, y, z, etc., called vectors,
satisfying the following axioms.

A. To every pair, x and y, of vectors in
$\mathfrak{V}$ there corresponds a vector z, called the
sum of x and y, $z = x + y$, in such a way that

(1) addition is commutative, $x + y = y + x$;

(2) addition is associative, $x + (y + z) = (x + y) + z$;

(3) there exists in $\mathfrak{V}$ a unique vector,
0, (called the origin) such that for all x in
$\mathfrak{V}$, $x + 0 = x$; and

(4) to every x in $\mathfrak{V}$ there corresponds
a unique vector, denoted by $-x$, with the property
$x + (-x) = 0$.

B. To every pair, $\alpha$ and x, where $\alpha$ is a
scalar and x is a vector in $\mathfrak{V}$, there corresponds
a vector y in $\mathfrak{V}$, called the product
of $\alpha$ and x, $y = \alpha x$, such that

(1) multiplication is distributive with
respect to vector addition, $\alpha(x + y) = \alpha x + \alpha y$;

(2) multiplication is distributive with
respect to scalar addition, $(\alpha + \beta)x = \alpha x + \beta x$;

(3) multiplication is associative,
$\alpha(\beta x) = (\alpha\beta)x$; and

(4) $0x = 0$, $1x = x$.

(These axioms are not logically independent: they
are merely a convenient characterization of the objects
we wish to study.) According as scalars are interpreted
as real or complex numbers we shall refer to real or complex
vector spaces.

§2. EXAMPLES OF VECTOR SPACES

Before discussing the implications of these axioms
we give some examples. We shall refer to these examples
over and over again and we shall use the notation established
here throughout the rest of our work.

(1) Let $\mathfrak{C}_1$ be the set of all complex numbers; if
we interpret $x + y$ and $\alpha x$ as ordinary complex numerical
addition and multiplication, $\mathfrak{C}_1$ becomes a complex vector
space.

(2) Let $\mathfrak{P}$ be the set of all polynomials with complex
coefficients in a real variable t. (There is no
deep reason for this arbitrary choice: it is merely a
matter of convenience for the purpose of giving examples
later.) To make $\mathfrak{P}$ into a complex vector space we interpret
addition and scalar multiplication as the ordinary
addition of two polynomials and multiplication of a
polynomial by a complex number, respectively; the origin
in $\mathfrak{P}$ is the polynomial identically zero.

Example (1) is too simple and example (2) too complicated
to be typical of the main contents of this book.
We give now another example of complex vector spaces
which (as we shall see later) is general enough for all
our purposes.

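(3) Let $\mathfrak{C}_n$, $n = 1, 2, \ldots$, be the set of all n-tuples
of complex numbers, $x = \{\xi_1, \ldots, \xi_n\}$; if $y = \{\eta_1, \ldots, \eta_n\}$,
we write, by definition,

$$x + y = \{\xi_1 + \eta_1, \ldots, \xi_n + \eta_n\},$$
$$\alpha x = \{\alpha\xi_1, \ldots, \alpha\xi_n\},$$
$$0 = \{0, \ldots, 0\}.$$

It is easy to verify that all parts of our two
axioms (A) and (B), §1, are satisfied, so that $\mathfrak{C}_n$ is a
complex vector space; it is usually called n-dimensional
complex Euclidean space.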
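As a small illustration of example (3), the following Python sketch models vectors of $\mathfrak{C}_3$ as tuples of complex numbers and spot-checks a few parts of axioms (A) and (B) on sample data. The names `add` and `scale` and the sample vectors are our own assumptions, not notation from the text, and a check on particular vectors is of course not a proof.

```python
# Vectors of C_3 modelled as tuples of Python complex numbers.
# `add` and `scale` are hypothetical helper names chosen for this sketch.

def add(x, y):
    """Coordinatewise sum of two n-tuples."""
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Product of the scalar alpha and the n-tuple x."""
    return tuple(alpha * a for a in x)

x = (1 + 2j, 0j, 3j)
y = (2 - 1j, 1j, 1 + 0j)
zero = (0j, 0j, 0j)
alpha, beta = 2 + 1j, -1j

# Spot checks of some parts of axioms (A) and (B) on the sample data.
commutative = add(x, y) == add(y, x)                  # (A)(1)
has_origin = add(x, zero) == x                        # (A)(3)
has_negative = add(x, scale(-1, x)) == zero           # (A)(4), using -x = (-1)x
distributive = scale(alpha, add(x, y)) == add(scale(alpha, x), scale(alpha, y))  # (B)(1)
associative = scale(alpha, scale(beta, x)) == scale(alpha * beta, x)             # (B)(3)
```

The sample coordinates have small integer real and imaginary parts, so the floating point comparisons above are exact; for general data a tolerance would be needed.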

(4) For any positive integer n let $\mathfrak{P}_n$ be the
set of all polynomials (with the same restrictions as in
(2)) of degree $\le n-1$, together with the polynomial
identically zero. (In the usual discussion of degree the
degree of this polynomial is not defined, so
that we cannot say that it has degree $\le n-1$.) With the
same interpretation of the linear operations (addition
and scalar multiplication) as in (2), $\mathfrak{P}_n$ is a complex
vector space.

(5) A close relative of $\mathfrak{C}_n$ is the set $\mathfrak{R}_n$ of all
n-tuples of real numbers, $x = \{\xi_1, \ldots, \xi_n\}$. With the
same formal definitions of addition and scalar multiplication
as for $\mathfrak{C}_n$, excepting that we consider only real
scalars $\alpha$, the space $\mathfrak{R}_n$ (the ordinary or real
n-dimensional Euclidean space) is a real vector space.


§3. COMMENTS ON NOTATION AND TERMINOLOGY

A few comments on our axioms and notation. Those
familiar with algebraic terminology will have recognized
the axioms (A), §1, as the defining conditions of an
abelian (commutative) group; the axioms (B) express the
fact that the group admits scalars as operators. We use
the "scalar" terminology to emphasize the fact that we
are not even necessarily dealing with real or complex
numbers. Ninety percent of the theory remains valid if
we interpret scalars as elements of any field. If
scalars are elements of a ring a vector space is sometimes
called a modul.

Special real vector spaces (namely $\mathfrak{R}_n$) are familiar
in geometry. There seems at this stage to be no excuse
for the apparently uninteresting insistence on complex
numbers. We hope that the reader is willing to take it
on faith that we shall have to make use of deep properties
of complex numbers later (conjugation, algebraic
closure), and that in both the applications of vector
spaces to modern (quantum mechanical) physics and in the
mathematical generalization of our results to Hilbert
space, complex numbers play an important role. Their one
great disadvantage is the difficulty of drawing pictures:
the ordinary picture (Argand diagram) of $\mathfrak{C}_1$ is indistinguishable
from that of $\mathfrak{R}_2$, and a graphic representation
of $\mathfrak{C}_2$ seems to be out of human reach. On
occasions when we have to use pictorial language we
shall therefore use the terminology of $\mathfrak{R}_n$ in $\mathfrak{C}_n$,
and speak of $\mathfrak{C}_2$, for example, as a plane.

Finally we comment on notation. We observe that the
symbol 0 has been used in two meanings: once as a
number and once as a vector. To make the situation worse
we shall later, when we introduce linear functionals and
linear transformations, give it still other meanings.
Fortunately the relation between the various interpretations
of 0 is such that after this word of warning no
confusion should arise from this practice. Another notationally
happy circumstance is that $-x$ (defined in
§1, (A)(4)) and $(-1)x$ are the same thing. This is true
since

$$x + (-1)x = 1x + (-1)x = (1 + (-1))x = 0x = 0.$$

§4. DEFINITION OF LINEAR DEPENDENCE

Now that we have described the spaces we shall work
with we must specify the relations among the elements of
these spaces which will be of interest to us. Vector
spaces are used to study linear problems: the general
form of a linear relation is described in the following
definition.

DEFINITION. A finite set of vectors,
$x_1, \ldots, x_n$, is linearly dependent if there
exist scalars $\alpha_1, \ldots, \alpha_n$, not all zero,
such that

$$\sum_i \alpha_i x_i = \alpha_1 x_1 + \cdots + \alpha_n x_n = 0.$$

If, on the other hand, $\sum_i \alpha_i x_i = 0$ implies
that $\alpha_1 = \cdots = \alpha_n = 0$, the vectors
$x_1, \ldots, x_n$ are linearly independent.

Linear dependence or independence are properties of
sets of vectors; it is customary however to apply these
adjectives to vectors themselves and thus we shall sometimes
say "a set of linearly independent vectors" instead
of "a linearly independent set of vectors". It will be
convenient also to speak of the linear dependence or independence
of a not necessarily finite set, $\mathfrak{X}$, of vectors.
We shall say that $\mathfrak{X}$ is linearly independent if
every finite subset of $\mathfrak{X}$ is such; otherwise $\mathfrak{X}$ is
linearly dependent.

To gain insight into the meaning of linear dependence
let us study the examples of vector spaces that we
already have.

(1) If x and y are any two vectors in $\mathfrak{C}_1$,
then x and y form a linearly dependent set. If
$x = y = 0$ this is trivial; if not we have, for example,
the relation $yx + (-x)y = 0$. Since it is clear that
any set containing a linearly dependent subset is itself
linearly dependent, this shows that any set containing
more than one element is a linearly dependent set.

(2) More interesting is the situation in the space
$\mathfrak{P}$. The vectors $x = x(t) = 1 - t$, $y = y(t) = t(1 - t)$,
and $z = z(t) = 1 - t^2$ are, for example, linearly dependent,
since $x + y - z = 0$. However the infinite set of
vectors $x_0, x_1, x_2, \ldots$ defined by

$$x_0(t) = 1, \quad x_1(t) = t, \quad x_2(t) = t^2, \quad x_3(t) = t^3, \ldots,$$

is a linearly independent set, for if we had any relation
of the form

$$\alpha_0 x_0 + \alpha_1 x_1 + \cdots + \alpha_n x_n = 0,$$

then we should have a polynomial relation

$$\alpha_0 + \alpha_1 t + \cdots + \alpha_n t^n = 0,$$

whence $\alpha_0 = \alpha_1 = \cdots = \alpha_n = 0$.
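The dependence relation $x + y - z = 0$ in example (2) can be checked mechanically. The sketch below, in Python, represents a polynomial by its list of coefficients in increasing powers of t; this representation and the helper names `poly_add` and `poly_scale` are assumptions made for the illustration only.

```python
from itertools import zip_longest

# A polynomial a0 + a1*t + a2*t^2 + ... is stored as the list [a0, a1, a2, ...].

def poly_add(p, q):
    """Sum of two polynomials given by coefficient lists (shorter list padded with 0)."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_scale(alpha, p):
    """Product of the scalar alpha and the polynomial p."""
    return [alpha * a for a in p]

x = [1, -1]       # x(t) = 1 - t
y = [0, 1, -1]    # y(t) = t(1 - t) = t - t^2
z = [1, 0, -1]    # z(t) = 1 - t^2

# The linear combination 1*x + 1*y + (-1)*z; every coefficient should vanish.
combo = poly_add(poly_add(x, y), poly_scale(-1, z))
is_zero_polynomial = all(c == 0 for c in combo)
```

In this representation the powers $1, t, t^2, \ldots$ correspond to the lists `[1]`, `[0, 1]`, `[0, 0, 1]`, and so on, which makes their linear independence visible: a linear combination of them is simply the list of its own coefficients.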

(3) As we mentioned before, the spaces $\mathfrak{C}_n$ are
the prototype of what we want to study: let us examine,
for example, the case $n = 3$. To those familiar with
higher dimensional geometry the notion of linear dependence
in this space (or, more properly speaking, in its
real analog $\mathfrak{R}_3$) has a concrete geometric meaning which
we shall only mention. In geometrical language: two
vectors are linearly dependent if and only if they are
collinear with the origin, and three vectors are linearly
dependent if and only if they are coplanar with the origin.
(If one thinks of a vector not as a point in a
space but as an arrow pointing from the origin to some
given point, the preceding sentence should be modified
by crossing out the phrase "with the origin" both times
that it occurs.) We shall presently introduce the notion
of linear manifolds (or vector subspaces) in a vector
space and use the geometrical language thereby suggested.

For a concrete example consider the vectors $x = \{1, 0, 0\}$,
$y = \{0, 1, 0\}$, $z = \{0, 0, 1\}$, and $u = \{1, 1, 1\}$.
These four vectors form a linearly dependent set, since
$x + y + z - u = 0$; it is easy to verify however that any
three of these vectors form a linearly independent set.
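The verification left to the reader can be sketched in Python. We test independence by computing the rank of a list of vectors with Gaussian elimination; this is a standard technique that we swap in for convenience, not a method introduced in the text, and the names `rank` and `is_independent` are hypothetical. For exactness the sketch assumes rational coordinates (the four vectors above have integer coordinates).

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a row at or below position r with a non zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_independent(vectors):
    """A finite set of vectors is linearly independent exactly when its rank equals its size."""
    return rank(vectors) == len(vectors)

x, y, z, u = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)

all_four_dependent = not is_independent([x, y, z, u])
every_triple_independent = all(
    is_independent(list(triple)) for triple in combinations([x, y, z, u], 3)
)
```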

§5. CHARACTERIZATION OF LINEAR DEPENDENCE

Returning to the general considerations we shall
say, whenever $x = \sum_i \alpha_i x_i = \alpha_1 x_1 + \cdots + \alpha_n x_n$, that x is a
linear combination of $x_1, \ldots, x_n$; we shall use without
any further explanation all the simple grammatical implications
of our terminology. Thus we shall say, in case
x is a linear combination of $x_1, \ldots, x_n$, that x is
linearly dependent on $x_1, \ldots, x_n$; we shall leave to the
reader the proof of the fact that if $x_1, \ldots, x_n$ are
linearly independent then x is a linear combination of
them if and only if the vectors $x, x_1, \ldots, x_n$ are
linearly dependent.

The fundamental theorem concerning linear dependence
is the following.

THEOREM. The set of non zero vectors,
$x_1, \ldots, x_n$, is linearly dependent if and only
if some $x_k$, $2 \le k \le n$, is a linear combination
of the preceding ones.

PROOF. Let us suppose that the set $x_1, \ldots, x_n$ is
linearly dependent and let k be the first integer between
2 and n for which $x_1, \ldots, x_k$ is linearly dependent.
(If worse comes to worst the hypothesis of the
theorem assures us that $k = n$ will do.) Then

$$\alpha_1 x_1 + \cdots + \alpha_k x_k = 0$$

for a suitable set of $\alpha$'s; moreover, whatever the $\alpha$'s,
$\alpha_k = 0$ is impossible, for then we should have a linear
dependence relation among $x_1, \ldots, x_{k-1}$, contrary to the
definition of k. Hence

$$x_k = \left(-\frac{\alpha_1}{\alpha_k}\right) x_1 + \cdots + \left(-\frac{\alpha_{k-1}}{\alpha_k}\right) x_{k-1},$$

as was to be proved. This proves the necessity of our
condition; sufficiency is clear since, as we remarked before,
any set containing a linearly dependent set is
itself such.

§6. DEFINITION AND CONSTRUCTION OF BASES

DEFINITION. A (linear) basis (or a
coordinate system) in a vector space $\mathfrak{V}$ is
a set $\mathfrak{X}$ of linearly independent vectors
such that every vector in $\mathfrak{V}$ is a linear
combination of elements of $\mathfrak{X}$. A vector
space $\mathfrak{V}$ is finite dimensional if it has a
finite basis.

Except for the occasional consideration of examples
we shall restrict our attention, throughout this book,
to finite dimensional vector spaces.

For examples of bases and finite and non finite dimensional
spaces we turn again to the spaces $\mathfrak{P}$ and
$\mathfrak{C}_n$. In $\mathfrak{P}$ the set $\{x_n\}$, $x_n(t) = t^n$, $n = 0, 1, 2, \ldots$,
is a basis: every polynomial is, by definition, a linear
combination of a finite number of $x_n$. Moreover $\mathfrak{P}$
has no finite basis, for given any finite set of polynomials
we may find a polynomial of higher degree than
any of them: this latter polynomial is obviously not a
linear combination of the former ones.

An example of a basis in $\mathfrak{C}_n$ is the set of vectors
$x_i$, $i = 1, \ldots, n$, defined by the condition that
the j-th coordinate of $x_i$ is $\delta_{ij}$. (Here we use for
the first time the popular Kronecker $\delta$; it is defined
by $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ if $i \ne j$.) Thus
we assert that in $\mathfrak{C}_3$ the vectors $x_1 = \{1, 0, 0\}$,
$x_2 = \{0, 1, 0\}$, and $x_3 = \{0, 0, 1\}$ are a basis. We have seen
before that they are linearly independent; the formula

$$x = \{\xi_1, \xi_2, \xi_3\} = \xi_1 x_1 + \xi_2 x_2 + \xi_3 x_3$$

proves that every x in $\mathfrak{C}_3$ is a linear combination of
them.

In a general finite dimensional vector space $\mathfrak{V}$,
with basis $x_1, \ldots, x_n$, we know that every x may be
written in the form

$$x = \sum_i \xi_i x_i;$$

we assert that the $\xi$'s are uniquely determined by x.
The proof of this assertion is an argument often used in
the theory of linear dependence. If we had
$x = \sum_i \eta_i x_i$, then we should have, by subtraction,

$$\sum_i (\xi_i - \eta_i) x_i = 0.$$

Since the x's are linearly independent, this implies
that $\xi_i - \eta_i = 0$ for $i = 1, \ldots, n$: in other words the
$\eta$'s are the same as the $\xi$'s.

THEOREM. If $\mathfrak{V}$ is a finite dimensional
vector space and $y_1, \ldots, y_m$ is any set of
linearly independent vectors in $\mathfrak{V}$, then,
unless the y's already form a basis, we can
find vectors $y_{m+1}, \ldots, y_{m+p}$ so that the
totality of y's, $y_1, \ldots, y_m, y_{m+1}, \ldots, y_{m+p}$,
forms a basis. In other words: every
linearly independent set can be extended to a
basis.

PROOF. Since $\mathfrak{V}$ is finite dimensional it has a
finite basis, say $x_1, \ldots, x_n$; we consider the set $\mathfrak{S}$
of vectors

$$y_1, \ldots, y_m, x_1, \ldots, x_n,$$

in this order, and we apply to this set the theorem of §5
several times in succession. In the first place the set
$\mathfrak{S}$ is linearly dependent since the y's are (as are all
vectors) linear combinations of the x's. Hence some
vector of $\mathfrak{S}$ is a linear combination of the preceding
ones; let z be the first such vector. Then z is different
from any $y_i$, $i = 1, \ldots, m$ (since the y's are
linearly independent), hence z is equal to some x,
say $z = x_i$. We consider the new set $\mathfrak{S}'$ of vectors

$$y_1, \ldots, y_m, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n.$$

We observe that every vector in $\mathfrak{V}$ is a linear combination
of vectors in $\mathfrak{S}'$ since by means of $y_1, \ldots, y_m$,
$x_1, \ldots, x_{i-1}$ we may express $x_i$, and then by means of
$x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n$ we may express any
vector. (The x's form a basis.) If $\mathfrak{S}'$ is linearly
independent, we are done. If it is not, we apply the
theorem of §5 again and again the same way until we
reach a linearly independent set, containing $y_1, \ldots, y_m$,
in terms of which we may express every vector in $\mathfrak{V}$.
This last set is a basis containing the y's.
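The procedure of this proof can be sketched as a small Python routine. We assume rational coordinates and vectors given as tuples, and we use a forward sifting variant: running through $y_1, \ldots, y_m, x_1, \ldots, x_n$ in order, we keep a vector only if it is not a linear combination of those already kept. This discards exactly the vectors that the repeated application of the theorem of §5 would remove. The names `rank` and `extend_to_basis` are hypothetical, and the rank computation by Gaussian elimination is our own choice of test.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(ys, xs):
    """Sift the list ys followed by xs.

    ys is assumed linearly independent and xs is assumed to be a basis.
    A vector is kept only if adding it raises the rank, i.e. only if it is
    not a linear combination of the vectors kept before it.
    """
    kept = []
    for v in list(ys) + list(xs):
        if rank(kept + [v]) > len(kept):
            kept.append(v)
    return kept

ys = [(1, 1, 0)]                          # a linearly independent set
xs = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]    # a known basis
basis = extend_to_basis(ys, xs)
```

Here the vector `(0, 1, 0)` is discarded, being the difference of the two vectors kept before it, and the result contains the given y together with two of the x's.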

§7. DIMENSION OF A VECTOR SPACE

THEOREM 1. The number of elements in any
basis of a finite dimensional vector space is
the same as in any other basis.

PROOF. The proof of this theorem is a slight refinement
of the proof of the theorem of §6 and proves, in fact, a
slightly stronger assertion. Let $\mathfrak{X} = \{x_1, \ldots, x_n\}$ and
$\mathfrak{Y} = \{y_1, \ldots, y_m\}$ be two finite sets of vectors each
with one of the two defining properties of a basis; i.e.
we assume that every vector in $\mathfrak{V}$ is a linear combination
of the x's (but not that the x's are linearly
independent), and we assume that the y's are linearly
independent (but not that every vector is a linear combination
of them). We may apply the theorem of §5, just
as above, to the set $\mathfrak{S}$:

$$y_m, x_1, \ldots, x_n.$$

Again we know that every vector is a linear combination
of vectors of $\mathfrak{S}$ and that $\mathfrak{S}$ is linearly dependent.
Reasoning just as before we obtain a set $\mathfrak{S}'$:

$$y_m, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n,$$

again with the property that every vector is a linear
combination of vectors of $\mathfrak{S}'$. Now we write $y_{m-1}$ in
front of the vectors of $\mathfrak{S}'$, and apply the same argument.
Continuing in this way it is clear that the x's will
not be exhausted before the y's are, since then the
remaining y's would have to be linearly independent of,
and at the same time linear combinations of, the ones already
incorporated into $\mathfrak{S}$. In other words after the
argument has been applied m times we obtain a set with
the same property the x's had, and this set will differ
from the x's in that m of them have been replaced by
y's. This seemingly innocent statement is what we are
after: it implies that $n \ge m$. Consequently if $\mathfrak{X}$ and
$\mathfrak{Y}$ are both bases (so that they each have both properties)
then $n \ge m$ and $m \ge n$.

DEFINITION. The number of elements in a
basis of a finite dimensional vector space $\mathfrak{V}$
is the linear dimension of $\mathfrak{V}$.

This definition (together with the fact that we
have already exhibited, in §6, one particular basis in
$\mathfrak{C}_n$) finally justifies our terminology and enables us
to announce the pleasant result: n-dimensional Euclidean
space is n-dimensional.

As a corollary of Theorem 1 we have

THEOREM 2. Any n+1 vectors in an n-dimensional
vector space $\mathfrak{V}$ are linearly dependent.
A set of n vectors in $\mathfrak{V}$ is a
basis if and only if it is linearly independent,
or, alternatively, if and only if every
vector in $\mathfrak{V}$ is a linear combination of elements
of the set.

§8. ISOMORPHISM OF VECTOR SPACES

As an application of the notion of linear basis, or
coordinate system, we shall now fulfill one of our
promises by showing that every finite dimensional vector
space is essentially the same as some $\mathfrak{C}_n$. To make this
statement precise we lay down the following definition.

DEFINITION. Two vector spaces $\mathfrak{U}$ and
$\mathfrak{V}$ are isomorphic if there is a one to one
correspondence between the vectors x of $\mathfrak{U}$
and the vectors y of $\mathfrak{V}$, say $y = T(x)$,
such that $T(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 T(x_1) + \alpha_2 T(x_2)$.
In other words $\mathfrak{U}$ and $\mathfrak{V}$ are isomorphic if
there is a one to one correspondence (an isomorphism)
between their elements which preserves
all linear relations.

It is easy to see that isomorphic finite dimensional
vector spaces have the same dimension: to a basis in one
space there corresponds a basis in the other space. Thus
dimension is an isomorphism invariant: we shall now show
that it is the only isomorphism invariant, i.e. that any
two complex vector spaces of the same finite dimension
are isomorphic. Since the isomorphism of $\mathfrak{U}$ and $\mathfrak{C}_n$
on the one hand, and of $\mathfrak{V}$ and $\mathfrak{C}_n$ on the other hand,
implies that $\mathfrak{U}$ and $\mathfrak{V}$ are isomorphic, it will be sufficient
to prove the following theorem.




THEOREM. Every n-dimensional complex
vector space $\mathfrak{V}$ is isomorphic to $\mathfrak{C}_n$.

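PROOF. Let $x_1, \ldots, x_n$ be any basis in $\mathfrak{V}$. For
any x in $\mathfrak{V}$ we have $x = \xi_1 x_1 + \cdots + \xi_n x_n$, and we
know that the numbers $\xi_1, \ldots, \xi_n$ are uniquely determined
by x. We consider the one to one correspondence

$$x \rightleftarrows \{\xi_1, \ldots, \xi_n\}$$

between $\mathfrak{V}$ and $\mathfrak{C}_n$. If $y = \eta_1 x_1 + \cdots + \eta_n x_n$ then

$$\alpha x + \beta y = (\alpha\xi_1 + \beta\eta_1) x_1 + \cdots + (\alpha\xi_n + \beta\eta_n) x_n:$$

this establishes the desired isomorphism.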
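To see the correspondence of this proof at work, the Python sketch below takes a two dimensional complex space with a basis other than the obvious one and computes the coordinates $\{\xi_1, \xi_2\}$ of a vector by solving a 2 by 2 system with Cramer's rule. The basis, the sample vectors, and the names `coords` and `lin_comb` are all assumptions chosen for the illustration; the check at the end confirms on one example that the coordinates of $\alpha x + \beta y$ are $\alpha\xi_i + \beta\eta_i$.

```python
def lin_comb(alpha, x, beta, y):
    """The linear combination alpha*x + beta*y of two tuples, coordinate by coordinate."""
    return tuple(alpha * a + beta * b for a, b in zip(x, y))

def coords(x, b1, b2):
    """Coordinates (xi1, xi2) of x relative to the basis b1, b2, so that
    x = xi1*b1 + xi2*b2. Solved by Cramer's rule for the 2 by 2 system."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent, hence not a basis")
    xi1 = (x[0] * b2[1] - x[1] * b2[0]) / det
    xi2 = (b1[0] * x[1] - b1[1] * x[0]) / det
    return (xi1, xi2)

b1, b2 = (1, 1), (1, -1)       # a basis of the two dimensional space (assumed)
x = (3 + 1j, 1 - 1j)
y = (2 + 0j, 4j)
alpha, beta = 2j, 1 - 1j

xi = coords(x, b1, b2)          # coordinates of x
eta = coords(y, b1, b2)         # coordinates of y

# The correspondence preserves linear relations:
lhs = coords(lin_comb(alpha, x, beta, y), b1, b2)
rhs = lin_comb(alpha, xi, beta, eta)
preserves_linear_relations = all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
```

A different choice of `b1` and `b2` gives a different correspondence between the same two spaces, which is the point of the discussion that follows.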

One might be tempted to say that from now on it would
be silly to try to preserve an appearance of generality
by talking of the general n-dimensional vector space,
since we know that from the point of view of studying
linear problems isomorphic vector spaces are indistinguishable,
and we may as well always study $\mathfrak{C}_n$. There is
one catch. The most important properties of vectors and
vector spaces are those which are independent of coordinate
systems, or, in other words, those which are invariant
under isomorphisms. The correspondence between $\mathfrak{V}$
and $\mathfrak{C}_n$ was, however, established by choosing a coordinate
system: were we always to study $\mathfrak{C}_n$ we would always be
tied down to this particular coordinate system, or else
we would always be faced with the problem of showing that
our definitions and theorems are independent of the
coordinate system in which they are stated. (This horrible
dilemma will become clear later on the few occasions when
we shall have to use a particular coordinate system to
give a definition.) Accordingly in the greater part of
this book we shall ignore the theorem just proved and
treat n-dimensional vector spaces as self respecting entities,
independently of any basis. Besides the reasons
just mentioned there is another reason for doing this:
many special examples of vector spaces, such for example
as $\mathfrak{P}_n$, would lose a lot of their intuitive content if
we were to transform them into $\mathfrak{C}_n$ and speak only of
coordinates. In studying vector spaces, such as $\mathfrak{P}_n$,
and their relation to other vector spaces, it is important
to be able to handle them with equal ease in different
coordinate systems, or, and this is essentially
the same thing, to be able to handle them without using
any coordinate system.


§9. LINEAR MANIFOLDS 

The objects of interest in geometry are not only the
points of the space under consideration, but also its
lines, planes, etc. The analogs, in general vector
spaces, of these higher dimensional elements are introduced
by the following definition.

DEFINITION. A non empty subset $\mathfrak{M}$ of a
vector space $\mathfrak{V}$ is a linear manifold if along
with every pair, x, y, of vectors contained in
$\mathfrak{M}$, every linear combination, $\alpha x + \beta y$, is
also contained in $\mathfrak{M}$.

A word of warning: along with x a linear manifold
also contains $x - x = 0$. Hence if we interpret linear
manifolds as lines, planes, etc., we must be careful to
consider only lines and planes that pass through the
origin.

A linear manifold $\mathfrak{M}$ in a vector space $\mathfrak{V}$ is itself
a vector space: we leave it to the reader to verify
that with the same definitions of addition and scalar
multiplication as we had in $\mathfrak{V}$, $\mathfrak{M}$ satisfies the axioms
A and B of §1.

Two special examples of linear manifolds are, (i)
the set $\mathfrak{O}$ consisting of the origin only, and (ii) the
whole space $\mathfrak{V}$. Less trivial examples are the following.

(1) Let n and m be any two positive integers,
$m \le n$. Let $\mathfrak{M}$ be the set of all vectors $x = \{\xi_1, \ldots, \xi_n\}$
in $\mathfrak{C}_n$ for which $\xi_1 = \cdots = \xi_m = 0$.

(2) With m and n as before, we consider the
space $\mathfrak{P}_n$ and any m real numbers, $t_1, \ldots, t_m$. Let
$\mathfrak{M}$ be the set of all vectors $x = x(t)$ in $\mathfrak{P}_n$ for
which $x(t_1) = \cdots = x(t_m) = 0$.

(3) Let $\mathfrak{M}$ be the set of all vectors $x = x(t)$ in
$\mathfrak{P}$ for which $x(t) = x(-t)$ holds identically in t.
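For example (3), the closure property in the definition of a linear manifold can be spot-checked with a short Python sketch. We again assume that a polynomial is stored as its list of coefficients in increasing powers of t, so that $x(t) = x(-t)$ holds identically exactly when every coefficient of an odd power vanishes; the helper names `lin_comb` and `is_even` are hypothetical.

```python
from itertools import zip_longest

def lin_comb(alpha, x, beta, y):
    """Coefficient list of alpha*x + beta*y (shorter list padded with 0)."""
    return [alpha * a + beta * b for a, b in zip_longest(x, y, fillvalue=0)]

def is_even(p):
    """True if p(t) = p(-t) identically, i.e. all odd-degree coefficients are zero."""
    return all(c == 0 for c in p[1::2])

x = [1, 0, 3]            # x(t) = 1 + 3t^2, an element of the manifold
y = [0, 0, 2, 0, -1]     # y(t) = 2t^2 - t^4, another element

w = lin_comb(2, x, -5, y)        # 2x - 5y = 2 - 4t^2 + 5t^4
origin = lin_comb(1, x, -1, x)   # x - x, the polynomial identically zero

stays_in_manifold = is_even(w)
contains_origin = is_even(origin) and all(c == 0 for c in origin)
```

By contrast `is_even([0, 1])` is false, since the polynomial t changes sign with t.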

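We need some notation and terminology. For any collection,
$\{\mathfrak{M}_\nu\}$, of subsets of a given set, say, for example,
for a collection of linear manifolds in a vector space
$\mathfrak{V}$, we write $\bigcap_\nu \mathfrak{M}_\nu$ for the intersection of all
$\mathfrak{M}_\nu$, i.e. the set of points common to all of them.
Also if $\mathfrak{M}$ and $\mathfrak{N}$ are subsets of a set we write $\mathfrak{M} \subset \mathfrak{N}$
if $\mathfrak{M}$ is a subset of $\mathfrak{N}$, i.e. if every element of $\mathfrak{M}$
lies also in $\mathfrak{N}$. (Observe that we do not exclude the
possibility $\mathfrak{M} = \mathfrak{N}$; thus we write $\mathfrak{V} \subset \mathfrak{V}$ as well
as $\mathfrak{O} \subset \mathfrak{V}$.) For a finite collection, $\mathfrak{M}_1, \ldots, \mathfrak{M}_n$,
we shall write $\mathfrak{M}_1 \cap \cdots \cap \mathfrak{M}_n$ in place of $\bigcap_\nu \mathfrak{M}_\nu$;
in case two linear manifolds, $\mathfrak{M}$ and $\mathfrak{N}$, are such that
$\mathfrak{M} \cap \mathfrak{N} = \mathfrak{O}$ we shall say that $\mathfrak{M}$ and $\mathfrak{N}$ are
disjoint.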


§10. CALCULUS OF LINEAR MANIFOLDS

THEOREM 1. The intersection of any collection
of linear manifolds is a linear manifold.

PROOF. If we use an index $\nu$ to tell apart the
members of the collection, so that the given linear manifolds
are $\mathfrak{M}_\nu$, let us write

$$\mathfrak{M} = \bigcap_\nu \mathfrak{M}_\nu.$$

Since every $\mathfrak{M}_\nu$ contains 0, so does $\mathfrak{M}$, hence $\mathfrak{M}$ is
non empty. If x and y belong to $\mathfrak{M}$ (i.e. to all
$\mathfrak{M}_\nu$), then $\alpha x + \beta y$ also belongs to all $\mathfrak{M}_\nu$;
hence $\mathfrak{M}$ is a linear manifold.

To give an application of this theorem, suppose
that $\mathfrak{S}$ is an arbitrary set of vectors (not necessarily
a linear manifold) in a vector space $\mathfrak{V}$. There certainly
exist linear manifolds $\mathfrak{M}$ containing every element
of $\mathfrak{S}$, $\mathfrak{S} \subset \mathfrak{M}$; the whole space $\mathfrak{V}$ is, for
example, such a linear manifold. Let us denote by $\mathfrak{M}$
the intersection of all linear manifolds containing $\mathfrak{S}$:
it is clear that $\mathfrak{M}$ is itself a linear manifold
containing $\mathfrak{S}$. $\mathfrak{M}$ is, moreover, the smallest such manifold:
if $\mathfrak{S}$ is also contained in the linear manifold $\mathfrak{N}$,
$\mathfrak{S} \subset \mathfrak{N}$, then $\mathfrak{M} \subset \mathfrak{N}$. The manifold $\mathfrak{M}$ so
defined is called the linear manifold spanned by $\mathfrak{S}$.

The connection between the notion of spanning and the
concepts studied in §§4-8 is the following.

THEOREM 2. If $\mathfrak{S}$ is any set of vectors
in a vector space $\mathfrak{V}$, and $\mathfrak{M}$ is the linear
manifold spanned by $\mathfrak{S}$, then $\mathfrak{M}$ is the same
as the set of all linear combinations of
elements of $\mathfrak{S}$.

PROOF. It is clear that a linear combination of
linear combinations of elements of $\mathfrak{S}$ may again be
written as a linear combination of elements of $\mathfrak{S}$.
Hence the set of all linear combinations of elements of
$\mathfrak{S}$ is a linear manifold containing $\mathfrak{S}$: since $\mathfrak{M}$ is
the smallest such manifold, it follows that this manifold
must also contain $\mathfrak{M}$. Now turn the argument around:
$\mathfrak{M}$ contains $\mathfrak{S}$ and is a linear manifold,
hence $\mathfrak{M}$ contains all linear combinations of elements of
$\mathfrak{S}$.

We see therefore that in our new terminology we may
define a linear basis as a set of linearly independent
vectors which spans the whole space.

As an easy consequence of Theorem 2, whose proof we
leave to the reader, we have:

THEOREM 3. If $\mathfrak{H}$ and $\mathfrak{K}$ are any two
linear manifolds and $\mathfrak{M}$ is the linear manifold
spanned by $\mathfrak{H}$ and $\mathfrak{K}$ together, then
$\mathfrak{M}$ is the same as the set of all vectors of
the form x+y, with x in $\mathfrak{H}$ and y in $\mathfrak{K}$.

Prompted by this theorem we shall use the notation
$\mathfrak{H} + \mathfrak{K}$ for the linear manifold $\mathfrak{M}$ spanned by $\mathfrak{H}$
and $\mathfrak{K}$.

§11. DIMENSION OF A LINEAR MANIFOLD

THEOREM 1. A linear manifold $\mathfrak{M}$ in an
n-dimensional vector space $\mathfrak{V}$ is a vector
space of dimension ≤ n.

PROOF. It is possible to give a deceptively short
proof of this theorem, that runs as follows. Every set
of n+1 vectors in $\mathfrak{V}$ is linearly dependent, hence the
same is true of $\mathfrak{M}$; hence in particular the number of
elements in any basis of $\mathfrak{M}$ is ≤ n; Q.E.D.

The trouble is that we defined dimension n by
requiring in the first place that there exist a finite
basis, and then demanding that this basis contain exactly
n elements. The proof above shows only that no basis
can contain more than n elements; it does not show that
any basis exists. Once the difficulty is observed,
however, it is easy to fill the gap. If $\mathfrak{M} = \mathfrak{O}$ then $\mathfrak{M}$
is 0-dimensional and we are done. If $\mathfrak{M}$ contains a
vector $x_1 \neq 0$, let $\mathfrak{M}_1 \subset \mathfrak{M}$ be the linear manifold
spanned by $x_1$. If $\mathfrak{M}_1 = \mathfrak{M}$, $\mathfrak{M}$ is 1-dimensional,
and we are done. If $\mathfrak{M}_1 \neq \mathfrak{M}$, let $x_2$ be an element
of $\mathfrak{M}$ not contained in $\mathfrak{M}_1$, and let $\mathfrak{M}_2$ be the linear
manifold spanned by $x_1$ and $x_2$; and so on. Now we may
legitimately employ the argument given above: after no
more than n steps of this sort the process reaches an
end, since (by Theorem 2, §7) we cannot find n+1
linearly independent vectors.
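
The step-by-step construction in this proof can be imitated numerically. The following Python sketch is an illustration only, not part of the original argument. It assumes that the manifold is handed to us as a finite list of spanning vectors with rational coordinates, whereas the proof chooses its vectors from the manifold itself; the names `rank` and `greedy_basis` and the sample vectors are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def greedy_basis(candidates):
    """Keep each vector not contained in the manifold spanned by those already kept."""
    basis = []
    for v in candidates:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# Hypothetical sample: four vectors spanning a manifold in 3-space.
vectors = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1)]
basis = greedy_basis(vectors)
print(basis)  # [(1, 1, 0), (0, 1, 1)]
```

The process stops after two steps here because the remaining candidates already lie in the manifold spanned by the two vectors kept, in agreement with the bound of at most n steps.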

An important consequence of this second and correct
proof of Theorem 1 is the following.

THEOREM 2. Given any linear manifold
$\mathfrak{M}$ in an n-dimensional vector space $\mathfrak{V}$ we
may find a basis $x_1, \ldots, x_m, x_{m+1}, \ldots, x_n$
in $\mathfrak{V}$, so that $x_1, \ldots, x_m$ are in $\mathfrak{M}$ and
form, therefore, a basis in $\mathfrak{M}$.

It is easy to manufacture examples illustrating the
concepts and results of the last two sections. One
example is this: a polynomial (for us a vector x = x(t)
in $\mathfrak{P}$) is called even if x(-t) = x(t), (see §9,(3))
and it is called odd if x(-t) = -x(t). Both the class
of even polynomials and the class of odd polynomials are
linear manifolds in $\mathfrak{P}$; these two linear manifolds, say
$\mathfrak{M}$ and $\mathfrak{N}$, are disjoint, and $\mathfrak{M} + \mathfrak{N} = \mathfrak{P}$.
(Proof?)


§12. CONJUGATE SPACES 

DEFINITION. A linear functional, y = y(x),
on a vector space $\mathfrak{V}$, is a scalar valued function
defined for every vector x, with the property
that (identically in the vectors $x_1$ and
$x_2$ and scalars $\alpha_1$ and $\alpha_2$)

    $y(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 y(x_1) + \alpha_2 y(x_2)$.

Let us look at some examples of linear functionals.
(1) For $x = \{\xi_1, \ldots, \xi_n\}$ in $\mathfrak{C}_n$, define
$y(x) = \xi_1$. More generally, let $\alpha_1, \ldots, \alpha_n$ be any n
complex numbers, and define, for
$x = \{\xi_1, \ldots, \xi_n\}$, $y(x) = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n$.

We observe that in any vector space and for any
linear functional y,

    $y(0) = y(0 \cdot 0) = 0 \cdot y(0) = 0$;

hence a linear functional as we defined it is sometimes
called homogeneous. In particular in $\mathfrak{C}_n$,

    $y(x) = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n + \beta$

is not a linear functional, unless $\beta = 0$.
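
The role of the constant term can be checked on a small example. The Python sketch below is illustrative only: the coefficients, the sample vectors, and the names `homogeneous`, `shifted`, and `is_linear_on_samples` are hypothetical, and a test on finitely many samples can refute linearity but cannot prove it.

```python
def is_linear_on_samples(y, samples, scalars):
    """Check y(a1*x1 + a2*x2) == a1*y(x1) + a2*y(x2) on the given samples only."""
    for x1 in samples:
        for x2 in samples:
            for a1 in scalars:
                for a2 in scalars:
                    combo = tuple(a1 * p + a2 * q for p, q in zip(x1, x2))
                    if y(combo) != a1 * y(x1) + a2 * y(x2):
                        return False
    return True

alphas = (2, -1, 3)  # hypothetical coefficients alpha_1, alpha_2, alpha_3

def homogeneous(x):
    """y(x) = alpha_1 xi_1 + alpha_2 xi_2 + alpha_3 xi_3."""
    return sum(a * xi for a, xi in zip(alphas, x))

def shifted(x, beta=5):
    """The same expression with a non zero constant beta added."""
    return homogeneous(x) + beta

samples = [(1, 0, 0), (0, 1, 0), (1, 2, 3)]
scalars = [0, 1, -2]
print(is_linear_on_samples(homogeneous, samples, scalars))  # True
print(is_linear_on_samples(shifted, samples, scalars))      # False
print(homogeneous((0, 0, 0)), shifted((0, 0, 0)))           # 0 5
```

The shifted function already fails at the origin, where a linear functional must take the value 0.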

(2) For x = x(t) in $\mathfrak{P}$, define y(x) = x(0).
More generally for any n real numbers $t_1, \ldots, t_n$
and any n complex numbers $\alpha_1, \ldots, \alpha_n$ define

    $y(x) = \alpha_1 x(t_1) + \cdots + \alpha_n x(t_n)$.

Another example, in a sense a limiting case of the one
just given, is the following. Let (a,b) be any interval
on the t axis, and let $\alpha(t)$ be any complex valued
integrable function defined for t in (a,b). We write
for every x = x(t) in $\mathfrak{P}$,

    $y(x) = \int_a^b \alpha(t) x(t)\, dt$.

(3) In an arbitrary vector space $\mathfrak{V}$, define
y(x) = 0 for every x in $\mathfrak{V}$.

This last example is a first hint of a general
situation. Let $\mathfrak{V}$ be any vector space and let $\mathfrak{V}'$ be the
collection of all linear functionals on $\mathfrak{V}$. Let us
denote by 0 the linear functional defined in (3)
(compare the comment at the end of §3); if $y_1 = y_1(x)$ and
$y_2 = y_2(x)$ are linear functionals on $\mathfrak{V}$ and $\alpha_1$ and
$\alpha_2$ are scalars, let us denote by $y_3$ the expression

    $y_3 = y_3(x) = \alpha_1 y_1(x) + \alpha_2 y_2(x)$.

It is easy to see that $y_3$ is a linear functional; we
denote it by $\alpha_1 y_1 + \alpha_2 y_2$.

With these definitions of the linear concepts,
(zero, addition, scalar multiplication), $\mathfrak{V}'$ forms a
vector space, the conjugate space of $\mathfrak{V}$.





§13. NOTATION FOR LINEAR FUNCTIONALS

Before studying linear functionals and conjugate
spaces in more detail we wish to introduce a notation
that may appear weird at first sight but which will
clarify many situations later on. We have already used the
trick of denoting such a composite object as a linear
functional, y(x), by a single letter y. Sometimes,
however, it is necessary to use the function notation and
indicate somehow that if y is a linear functional on
$\mathfrak{V}$ and if x is a vector in $\mathfrak{V}$, then y(x) is a
fixed scalar. The notation we propose to adopt here,
instead of writing y followed by x in parentheses, is
x and y enclosed between square brackets and separated
by a comma: in other words we shall write [x,y] in
place of y(x). Because of the unusual nature of this
notation we shall expend on it some further verbiage.

As we have just pointed out [x,y] is a substitute
for the function notation y(x); [x,y] is the scalar
value we obtain if we take the value of the linear
functional y at the vector x. Let us take an analogous
situation (but not concerned with linear functionals).
Let x be a real variable; let y = y(x) be the function
$y(x) = x^2$. The notation [x,y] is a symbolic way
of writing down the recipe for the actual operations
performed; it corresponds to the sentence [take a number
x, and square it].

Using this notation we may sum up: to every vector
space $\mathfrak{V}$ we make correspond the conjugate space $\mathfrak{V}'$
consisting of all linear functionals on $\mathfrak{V}$; to every pair
x and y, where x is a vector in $\mathfrak{V}$ and y is a
linear functional in $\mathfrak{V}'$, we make correspond the scalar
[x,y] defined to be the value of y at x. In terms of
the symbol [x,y] the defining property of a linear
functional is

    (1)  $[\alpha_1 x_1 + \alpha_2 x_2, y] = \alpha_1 [x_1, y] + \alpha_2 [x_2, y]$,

and the definition of the linear operations on linear
functionals is

    (2)  $[x, \alpha_1 y_1 + \alpha_2 y_2] = \alpha_1 [x, y_1] + \alpha_2 [x, y_2]$.

The two relations together are expressed by saying that
[x,y] is a bilinear functional of the vectors x in $\mathfrak{V}$
and y in $\mathfrak{V}'$.

§14. BASES IN CONJUGATE SPACES

One more word before embarking on the proofs of the
important theorems. The concept of conjugate space was
defined without any reference to coordinate systems; a
glance at the following proofs will show a superabundance
of coordinate systems. We wish to point out that
this phenomenon is inevitable in this case: we shall be
establishing results concerning dimension, and dimension
is the one concept (so far) whose very definition is
given in terms of a basis.

THEOREM 1. If $\mathfrak{V}$ is an n-dimensional
vector space, if $\mathfrak{X} = (x_1, \ldots, x_n)$ is a
basis in $\mathfrak{V}$, and if $\alpha_1, \ldots, \alpha_n$ is any
set of n scalars, then there is one and only
one linear functional y on $\mathfrak{V}$ such that
$[x_i, y] = \alpha_i$ for i = 1, ..., n.

PROOF. Every x in $\mathfrak{V}$ may be written in the form
$x = \xi_1 x_1 + \cdots + \xi_n x_n$ in one and only one way; if y
is any linear functional then

    $[x, y] = \xi_1 [x_1, y] + \cdots + \xi_n [x_n, y]$.

From this relation the uniqueness of y is clear: if
$[x_i, y] = \alpha_i$, then the value of [x,y] is determined for
every x by $[x, y] = \sum_i \xi_i \alpha_i$. The argument can also
be turned around: if we define

    $[x, y] = \xi_1 \alpha_1 + \cdots + \xi_n \alpha_n$

then y is indeed a linear functional and $[x_i, y] = \alpha_i$.



THEOREM 2. If $\mathfrak{V}$ is an n-dimensional
vector space and $\mathfrak{X} = (x_1, \ldots, x_n)$ is a
basis in $\mathfrak{V}$, then there is a uniquely
determined basis $\mathfrak{X}'$ in $\mathfrak{V}'$, $\mathfrak{X}' = (y_1, \ldots, y_n)$,
called the dual basis of $\mathfrak{X}$, with the property
that $[x_i, y_j] = \delta_{ij}$. Consequently the
conjugate space of an n-dimensional vector
space is n-dimensional.

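PROOF. It follows from Theorem 1 that for each
j = 1, ..., n a unique $y_j$ in $\mathfrak{V}'$ can be found for
which $[x_i, y_j] = \delta_{ij}$; we have only to prove that the set
$\mathfrak{X}' = (y_1, \ldots, y_n)$ is a basis in $\mathfrak{V}'$.

In the first place $\mathfrak{X}'$ is a linearly independent
set, for if we had $\alpha_1 y_1 + \cdots + \alpha_n y_n = 0$, in other
words if

    $[x, \alpha_1 y_1 + \cdots + \alpha_n y_n] = \alpha_1 [x, y_1] + \cdots + \alpha_n [x, y_n] = 0$

for all x, then we should have, for $x = x_i$,

    $0 = \sum_j \alpha_j [x_i, y_j] = \sum_j \alpha_j \delta_{ij} = \alpha_i$.

In the second place every y in $\mathfrak{V}'$ is a linear
combination of $y_1, \ldots, y_n$. To prove this write
$[x_i, y] = \alpha_i$; then for $x = \sum_i \xi_i x_i$

    (1)  $[x, y] = \xi_1 \alpha_1 + \cdots + \xi_n \alpha_n$.

On the other hand

    (2)  $[x, y_j] = \sum_i \xi_i [x_i, y_j] = \xi_j$

so that, substituting in (1),

    $[x, y] = \alpha_1 [x, y_1] + \cdots + \alpha_n [x, y_n]$
    $\quad = [x, \alpha_1 y_1 + \cdots + \alpha_n y_n]$.

Consequently $y = \alpha_1 y_1 + \cdots + \alpha_n y_n$, and the proof of
the theorem is complete.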
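
In coordinates the dual basis can be computed by inverting a matrix. The Python sketch below is a hedged illustration rather than anything taken from the text: it assumes vectors are given by rational coordinates and that a linear functional is represented by its row of coefficients, so that [x,y] is a sum of products. Under that assumption, if the basis vectors are the columns of a matrix B, the dual functionals are the rows of the inverse of B. The names `inverse`, `dual_basis`, and `bracket` and the sample basis are hypothetical.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [[Fraction(c) for c in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular (the vectors are not a basis).
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [c / lead for c in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(basis):
    """Coefficient rows of y_1, ..., y_n satisfying [x_i, y_j] = delta_ij."""
    n = len(basis)
    B = [[basis[j][i] for j in range(n)] for i in range(n)]  # columns are the x_j
    return inverse(B)

def bracket(x, y):
    """[x, y]: the value at x of the functional with coefficient row y."""
    return sum(a * b for a, b in zip(y, x))

basis = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]  # hypothetical basis of 3-space
dual = dual_basis(basis)
table = [[bracket(x, y) for y in dual] for x in basis]
print(table == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```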

We shall need also the following easy consequence of 
Theorem 2. 



THEOREM 3. If x' and x'' are any
two different vectors of the n-dimensional
vector space $\mathfrak{V}$, then there exists a linear
functional y on $\mathfrak{V}$ such that
$[x', y] \neq [x'', y]$; or, equivalently, to any
non zero vector x in $\mathfrak{V}$ there corresponds
a y in $\mathfrak{V}'$ such that $[x, y] \neq 0$.

PROOF. That the two statements in the theorem are
indeed equivalent is seen by considering x = x' - x''.
We shall, accordingly, prove only the latter statement.

Let $\mathfrak{X} = (x_1, \ldots, x_n)$ be any basis in $\mathfrak{V}$, and
let $\mathfrak{X}' = (y_1, \ldots, y_n)$ be the dual basis in $\mathfrak{V}'$. If
$x = \sum_i \xi_i x_i$, then (see (2)) $[x, y_j] = \xi_j$. Hence if
[x,y] = 0 for all y, in particular if $[x, y_j] = 0$ for
j = 1, ..., n, then x = 0.

§15. REFLEXIVITY OF FINITE DIMENSIONAL SPACES

It is natural to think that if the conjugate space,
$\mathfrak{V}'$, of a vector space $\mathfrak{V}$, and the relations between
a space and its conjugate space, are of any interest at
all for $\mathfrak{V}$, then they are of just as much interest for
$\mathfrak{V}'$. In other words we may form the conjugate space
$(\mathfrak{V}')'$ of $\mathfrak{V}'$: for simplicity of notation we shall
denote it by $\mathfrak{V}''$. The verbal description of an
element of $\mathfrak{V}''$ is clumsy: such an element is a linear
functional of linear functionals. It is, however, at this
point that the greatest advantage of the notation [x,y]
appears: by means of it, it is easy to discuss $\mathfrak{V}$ and
its relation to $\mathfrak{V}''$.

If we consider the symbol [x,y] for some fixed
$y = y_0$, we obtain nothing new: $[x, y_0]$ is merely another
way of writing the function $y_0 = y_0(x)$. If, however,
we consider the symbol [x,y] for some fixed $x = x_0$,
then we observe that the function $[x_0, y]$ of the vectors
y in $\mathfrak{V}'$ is a scalar valued function which is linear
(see §13, (2)): in other words $[x_0, y]$ is a linear
functional on $\mathfrak{V}'$ and, as such, an element of $\mathfrak{V}''$.

By this method we have exhibited some linear
functionals on $\mathfrak{V}'$: have we exhibited them all? For the
finite dimensional case the following theorem furnishes
the affirmative answer.

THEOREM. If $\mathfrak{V}$ is a finite dimensional
vector space then corresponding to every linear
functional $z_0 = z_0(y)$ on $\mathfrak{V}'$ there is a vector
$x_0$ in $\mathfrak{V}$ such that $z_0(y) = [x_0, y] = y(x_0)$;
the correspondence $z_0 \rightleftarrows x_0$, the so-called
natural correspondence between $\mathfrak{V}''$ and $\mathfrak{V}$,
is an isomorphism.

PROOF. Let us view the correspondence from the
standpoint of going from $\mathfrak{V}$ to $\mathfrak{V}''$: in other words to
every $x_0$ in $\mathfrak{V}$ we make correspond a vector $z_0$ in
$\mathfrak{V}''$ defined by $z_0(y) = y(x_0)$ for every y in $\mathfrak{V}'$.
Since [x,y] = y(x) is linear in x, the transformation
$x_0 \to z_0$ is linear.

We shall show that this transformation is one to
one, as far as it goes: in other words if to x' in
$\mathfrak{V}$ there corresponds the linear functional z' = z'(y) =
[x',y] on $\mathfrak{V}'$, and to x'' there corresponds z'' =
z''(y) = [x'',y], and if z' = z'' then x' = x''. To
say that z' = z'' means that [x',y] = [x'',y] for
every y in $\mathfrak{V}'$: it follows from Theorem 3 of §14
that x' = x''.

The last two paragraphs together show that the set
of those linear functionals z = z(y) on $\mathfrak{V}'$ (i.e.
elements of $\mathfrak{V}''$) which do have the desired form, z(y) =
[x,y] for a suitable x in $\mathfrak{V}$, form a linear manifold
in $\mathfrak{V}''$ which is isomorphic to $\mathfrak{V}$ and which is
therefore n-dimensional. But the n-dimensionality of
$\mathfrak{V}$ implies that of $\mathfrak{V}'$, which in turn implies that $\mathfrak{V}''$
is n-dimensional. It follows that $\mathfrak{V}''$ must coincide with
the n-dimensional linear manifold just described, and the
proof of the theorem is complete.

It is important to observe that this theorem shows
not only that $\mathfrak{V}$ and $\mathfrak{V}''$ are isomorphic (this much is
trivial from the fact that they have the same dimension)
but that the natural correspondence is an isomorphism.
This property of vector spaces is called reflexivity;
every finite dimensional vector space is reflexive.

It is extremely convenient to be mildly sloppy about
$\mathfrak{V}''$: for finite dimensional vector spaces we shall
identify $\mathfrak{V}''$ with $\mathfrak{V}$ (by the natural isomorphism) and
we shall say that the element $z_0$ of $\mathfrak{V}''$ is
the same as the element $x_0$. In this language it is very
easy to express the relation between a basis $\mathfrak{X}$, in $\mathfrak{V}$,
and the dual basis of its dual basis, in $\mathfrak{V}''$: the
symmetry of the relation $[x_i, y_j] = \delta_{ij}$ shows that
$\mathfrak{X}'' = \mathfrak{X}$.

§16. ANNIHILATORS OF LINEAR MANIFOLDS

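DEFINITION. The annihilator, $\mathfrak{S}^0$, of
any subset $\mathfrak{S}$ of a vector space $\mathfrak{V}$, ($\mathfrak{S}$
need not be a linear manifold), is the set of
all vectors y in $\mathfrak{V}'$ for which [x,y] is
identically zero for all x in $\mathfrak{S}$.

Thus $\mathfrak{O}^0 = \mathfrak{V}'$ and $\mathfrak{V}^0 = \mathfrak{O}$ ($\subset \mathfrak{V}'$). If $\mathfrak{V}$ is
finite dimensional and $\mathfrak{S}$ contains a non zero vector,
i.e. $\mathfrak{S} \neq \mathfrak{O}$, then Theorem 3 of §14 shows that
$\mathfrak{S}^0 \neq \mathfrak{V}'$.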

THEOREM 1. If $\mathfrak{M}$ is an m-dimensional
linear manifold in an n-dimensional vector
space $\mathfrak{V}$, then $\mathfrak{M}^0$ is an n-m dimensional
linear manifold in $\mathfrak{V}'$.


PROOF. We leave it to the reader to verify that
$\mathfrak{M}^0$ (in fact $\mathfrak{S}^0$, for an arbitrary $\mathfrak{S}$) is always a
linear manifold; we shall prove only the statement
concerning the dimension of $\mathfrak{M}^0$.

Let $\mathfrak{X} = (x_1, \ldots, x_n)$ be a basis in $\mathfrak{V}$ whose
first m elements are in $\mathfrak{M}$ (and are therefore a basis
for $\mathfrak{M}$); let $\mathfrak{X}' = (y_1, \ldots, y_n)$ be the dual basis in
$\mathfrak{V}'$. We denote by $\mathfrak{N}$ the linear manifold (in $\mathfrak{V}'$)
spanned by $y_{m+1}, \ldots, y_n$; clearly $\mathfrak{N}$ has dimension n-m.
We shall prove that $\mathfrak{M}^0 = \mathfrak{N}$.

If x is any vector in $\mathfrak{M}$, then x is a linear
combination of $x_1, \ldots, x_m$, $x = \sum_{i=1}^m \xi_i x_i$, and for any
j = m+1, ..., n we have

    $[x, y_j] = \sum_{i=1}^m \xi_i [x_i, y_j] = 0$.

In other words for j = m+1, ..., n, $y_j$ is in $\mathfrak{M}^0$; it
follows that $\mathfrak{N}$ is contained in $\mathfrak{M}^0$, $\mathfrak{N} \subset \mathfrak{M}^0$.
Suppose on the other hand that y is any element of $\mathfrak{M}^0$.
Since y, being in $\mathfrak{V}'$, is a linear combination of the
basis vectors $y_1, \ldots, y_n$, we may write $y = \sum_{j=1}^n \eta_j y_j$.
Since, by hypothesis, y is in $\mathfrak{M}^0$ we have for every
i = 1, ..., m

    $0 = [x_i, y] = \sum_{j=1}^n \eta_j [x_i, y_j] = \eta_i$;

in other words y is a linear combination of $y_{m+1}, \ldots, y_n$.
This proves that y is in $\mathfrak{N}$, and consequently
that $\mathfrak{M}^0 \subset \mathfrak{N}$, and the theorem follows.
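
The dimension count can be checked on a concrete case. The Python sketch below is an illustration under assumptions of our own: vectors have rational coordinates, a linear functional is represented by its row of coefficients, and the annihilator of the manifold spanned by given vectors is found as the solution set of the corresponding homogeneous linear equations. The name `null_space` and the sample data are hypothetical.

```python
from fractions import Fraction

def null_space(rows, n):
    """Basis of all y in n-space with sum_k row[k]*y[k] == 0 for every given row."""
    m = [[Fraction(c) for c in row] for row in rows]
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for col in range(n):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][col]
        m[r] = [c / lead for c in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        y = [Fraction(0)] * n
        y[free] = Fraction(1)
        for row_index, col in enumerate(pivots):
            y[col] = -m[row_index][free]
        basis.append(y)
    return basis

# Hypothetical manifold of dimension m = 2 in a space of dimension n = 4.
manifold_basis = [(1, 0, 1, 0), (0, 1, 1, 1)]
annihilator = null_space(manifold_basis, 4)
print(len(annihilator))  # 2, that is n - m
```

Every functional found in this way vanishes on both spanning vectors, and the number of independent functionals is n - m = 2, as the theorem asserts.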

THEOREM 2. If $\mathfrak{M}$ is a linear manifold
in a finite dimensional vector space $\mathfrak{V}$ then
$\mathfrak{M}^{00}$ $(= (\mathfrak{M}^0)^0)$ $= \mathfrak{M}$.

PROOF. Observe that we use here the convention,
established at the end of §15, identifying $\mathfrak{V}$ and $\mathfrak{V}''$.
By definition $\mathfrak{M}^{00}$ is the set of all vectors x for
which y in $\mathfrak{M}^0$ implies [x,y] = 0. Since, by definition
of $\mathfrak{M}^0$, [x,y] = 0 for all x in $\mathfrak{M}$ and all y in
$\mathfrak{M}^0$, it follows that $\mathfrak{M} \subset \mathfrak{M}^{00}$. The desired
conclusion now follows from a dimension argument. Let $\mathfrak{M}$
be m-dimensional; then the dimension of $\mathfrak{M}^0$ is n-m, and
that of $\mathfrak{M}^{00}$ is n-(n-m) = m. Hence $\mathfrak{M} = \mathfrak{M}^{00}$, as was
to be proved.

To see an example of an annihilator the reader
might determine what the most general linear functional
in $\mathfrak{C}_n$ looks like, (see §12, (1)) and then find the
annihilator of the manifold described in §9,(1). Also: it
is an easy consequence of Theorem 2 of this section that
for any subset $\mathfrak{S}$ of a finite dimensional vector space,
$\mathfrak{S}^{00}$ is the same as the linear manifold spanned by $\mathfrak{S}$.
Proof?


§17. DIRECT SUMS

In this section we shall describe a general method
of making new vector spaces out of old ones.

DEFINITION. If $\mathfrak{U}$ and $\mathfrak{V}$ are any two
vector spaces their direct sum, $\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$,
is the set of all pairs $\{x, y\}$ with x in $\mathfrak{U}$
and y in $\mathfrak{V}$, and with the linear operations
defined by the formula

    $\alpha_1 \{x_1, y_1\} + \alpha_2 \{x_2, y_2\} = \{\alpha_1 x_1 + \alpha_2 x_2,\ \alpha_1 y_1 + \alpha_2 y_2\}$.

We observe that the formation of the direct sum is
analogous to the way in which the Euclidean plane is
constructed from the x and y axes.

We investigate the relation of this notion to some
of our earlier ones.

The set of all vectors (in $\mathfrak{W}$) of the form $\{x, 0\}$
is a linear manifold in $\mathfrak{W}$; the correspondence
$\{x, 0\} \rightleftarrows x$ shows that this linear manifold is isomorphic to
$\mathfrak{U}$. It is convenient, once more, to indulge in a logical
inaccuracy and to identify x and $\{x, 0\}$ and speak
of $\mathfrak{U}$ as a linear manifold in $\mathfrak{W}$. Similarly, of course,
the vectors y of $\mathfrak{V}$ may be identified with the vectors
of the form $\{0, y\}$ in $\mathfrak{W}$, and we may consider $\mathfrak{V}$ as
a linear manifold in $\mathfrak{W}$. This terminology is not, to
be sure, quite exact, but the logical difficulty is much
easier to get around here than it was in the case of the
second conjugate space. We could have defined the
direct sum of $\mathfrak{U}$ and $\mathfrak{V}$ to be the set consisting of all
x's in $\mathfrak{U}$, all y's in $\mathfrak{V}$, and all those pairs $\{x, y\}$
for which $x \neq 0$ and $y \neq 0$. This definition yields a
theory analogous in every detail to the one we shall
develop, but it makes it a nuisance to prove theorems
because of the case distinctions it necessitates. It is
clear, however, that from the point of view of this
definition $\mathfrak{U}$ is actually a subset of $\mathfrak{U} \oplus \mathfrak{V}$. In
this sense then, or in the isomorphism sense of the
definition we did adopt, we raise the question: what is the
relation between $\mathfrak{U}$ and $\mathfrak{V}$ when we consider these
spaces as subspaces of the big space $\mathfrak{W}$?

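THEOREM. If $\mathfrak{U}$ and $\mathfrak{V}$ are linear manifolds
in a vector space $\mathfrak{W}$, then the following
three conditions are equivalent.

(1) $\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$;

(2) $\mathfrak{U}$ and $\mathfrak{V}$ are disjoint and $\mathfrak{W} = \mathfrak{U} + \mathfrak{V}$, (§10);

(3) Every vector z in $\mathfrak{W}$ may be written in
the form z = x+y, with x in $\mathfrak{U}$ and y in
$\mathfrak{V}$, in one and only one way.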

PROOF. We shall prove the implications (1) ⇒
(2) ⇒ (3) ⇒ (1).

(1) ⇒ (2). We assume that $\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$. If
$z = \{x, y\}$ lies in both $\mathfrak{U}$ and $\mathfrak{V}$ then x = y = 0, so
that z = 0, and $\mathfrak{U}$ and $\mathfrak{V}$ are disjoint. Since the
representation $z = \{x, 0\} + \{0, y\} = x + y$ is valid for every
z, it follows also that $\mathfrak{U} + \mathfrak{V} = \mathfrak{W}$.

(2) ⇒ (3). If we assume (2), so that in particular
$\mathfrak{U} + \mathfrak{V} = \mathfrak{W}$, then it is clear that every z in $\mathfrak{W}$
has the desired representation, z = x+y. To prove
uniqueness we suppose that z = x'+y'; x and x' are in
$\mathfrak{U}$ and y and y' are in $\mathfrak{V}$. It follows from x+y =
x'+y' that x - x' = y' - y. Since the left member of
this last equation is in $\mathfrak{U}$ and the right member is in
$\mathfrak{V}$, the disjointness of $\mathfrak{U}$ and $\mathfrak{V}$ implies that
x = x' and y = y'.

(3) ⇒ (1). This implication is practically
indistinguishable from the definition of direct sum. If
we form the direct sum, $\mathfrak{U} \oplus \mathfrak{V}$, and then identify
$\{x, 0\}$ and $\{0, y\}$ with x and y respectively, we are
committed to identifying the sum $\{x, y\} = \{x, 0\} + \{0, y\}$
with what we are assuming to be the general element z =
x+y of $\mathfrak{W}$; from the hypothesis that the z = x+y
representation is unique we conclude that the correspondence
between $\{x, 0\}$ and x (and also between $\{0, y\}$ and y)
is one to one.

§18. DIMENSION OF A DIRECT SUM

What can be said about the dimension of a direct
sum? If $\mathfrak{U}$ is n-dimensional, $\mathfrak{V}$ is m-dimensional, and
$\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$, what is the dimension of $\mathfrak{W}$? This
question is easy to answer.

THEOREM 1. The dimension of a direct sum
is the sum of dimensions of its summands.

PROOF. We assert that if $x_1, \ldots, x_n$ is a basis
in $\mathfrak{U}$, and if $y_1, \ldots, y_m$ is a basis in $\mathfrak{V}$, then the
set $x_1, \ldots, x_n, y_1, \ldots, y_m$ (or, more precisely, the
set $\{x_1, 0\}, \ldots, \{x_n, 0\}, \{0, y_1\}, \ldots, \{0, y_m\}$) is a basis
in $\mathfrak{W}$. The easiest proof of this assertion is to use
the implication (1) ⇒ (3) from the theorem of the
preceding section. Since every z in $\mathfrak{W}$ may be written in
the form z = x+y, where x is a linear combination of
$x_1, \ldots, x_n$ and y is a linear combination of $y_1, \ldots,
y_m$, it follows that our set does indeed span $\mathfrak{W}$. To
show that the set is also linearly independent, suppose
that

    $\alpha_1 x_1 + \cdots + \alpha_n x_n + \beta_1 y_1 + \cdots + \beta_m y_m = 0$.

The uniqueness of the z = x+y representation implies
that

    $\alpha_1 x_1 + \cdots + \alpha_n x_n = \beta_1 y_1 + \cdots + \beta_m y_m = 0$,

and hence the linear independence of the x's and of the
y's implies that

    $\alpha_1 = \cdots = \alpha_n = \beta_1 = \cdots = \beta_m = 0$.

THEOREM 2. If $\mathfrak{W}$ is any n+m dimensional
vector space, and $\mathfrak{U}$ is any n-dimensional
linear manifold in $\mathfrak{W}$, there exists an
m-dimensional linear manifold $\mathfrak{V}$ in $\mathfrak{W}$ such that
$\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$.

PROOF. Let $x_1, \ldots, x_n$ be any basis in $\mathfrak{U}$; by
the theorem of §9 we may find a set $y_1, \ldots, y_m$ of
vectors in $\mathfrak{W}$ with the property that $x_1, \ldots, x_n, y_1,
\ldots, y_m$ is a basis in $\mathfrak{W}$. Let $\mathfrak{V}$ be the linear manifold
spanned by $y_1, \ldots, y_m$; we omit the verification
that $\mathfrak{W} = \mathfrak{U} \oplus \mathfrak{V}$.

We observe that there is no reason to expect $\mathfrak{V}$ to
be uniquely determined: it is easy to construct examples
to show that indeed it is not. We consider a few
examples of direct sums.

(1) Let $\mathfrak{U}$ be the linear manifold of all vectors
$\{\xi_1, \ldots, \xi_n, \xi_{n+1}, \ldots, \xi_{n+m}\}$ in $\mathfrak{C}_{n+m}$ for which
$\xi_{n+1} = \cdots = \xi_{n+m} = 0$; let $\mathfrak{V}$ be the manifold for which
$\xi_1 = \cdots = \xi_n = 0$.

(2) For a fancier example we consider $\mathfrak{P}$, and we
let $\mathfrak{U}$ be the set of even polynomials, and $\mathfrak{V}$ the set
of odd polynomials. (See §11). The relation

    $x(t) = \tfrac{1}{2}\{x(t) + x(-t)\} + \tfrac{1}{2}\{x(t) - x(-t)\}$,

valid for every polynomial x(t), shows that $\mathfrak{U} \oplus \mathfrak{V}
= \mathfrak{P}$.
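
The relation in example (2) can be carried out on coefficients. In the Python sketch below, which is an illustration and not part of the text, a polynomial is assumed to be stored as the list of its coefficients in increasing powers of t; the names `reflect` and `even_odd_parts` and the sample polynomial are hypothetical.

```python
from fractions import Fraction

def reflect(coeffs):
    """Coefficients of x(-t), given those of x(t) in increasing powers of t."""
    return [c if j % 2 == 0 else -c for j, c in enumerate(coeffs)]

def even_odd_parts(coeffs):
    """Return (1/2){x(t) + x(-t)} and (1/2){x(t) - x(-t)} as coefficient lists."""
    x = [Fraction(c) for c in coeffs]
    xr = reflect(x)
    even = [(a + b) / 2 for a, b in zip(x, xr)]
    odd = [(a - b) / 2 for a, b in zip(x, xr)]
    return even, odd

x = [4, -3, 0, 7, 2]  # 4 - 3t + 7t^3 + 2t^4
even, odd = even_odd_parts(x)
print(even == [4, 0, 0, 0, 2])  # True: 4 + 2t^4
print(odd == [0, -3, 0, 7, 0])  # True: -3t + 7t^3
```

The two parts add up to x, the first is unchanged by the substitution of -t for t, and the second changes sign, which is the decomposition z = x+y of condition (3) in §17.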

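(3) Let $\mathfrak{U}$ be the set of vectors $\{\xi_1, \xi_2\}$ in
$\mathfrak{C}_2$ for which $\xi_2 = 0$; let $\mathfrak{V}$ be the vectors for which
$\xi_1 = 0$ (see (1) above); and let $\tilde{\mathfrak{V}}$ be the vectors for
which $\xi_1 = \xi_2$. Then $\mathfrak{C}_2 = \mathfrak{U} \oplus \mathfrak{V} = \mathfrak{U} \oplus \tilde{\mathfrak{V}}$.
Picture?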

§19. CONJUGATE SPACES OF DIRECT SUMS 

In most of what follows we shall view the notion of
direct sum as defined for linear manifolds in a vector
space $\mathfrak{V}$; this avoids fussing with the identification
convention of §17 and turns out, incidentally, to be the
more useful concept for our later work. We conclude, for
the present, our study of direct sums, by observing the
simple relation connecting conjugate spaces, annihilators,
and direct sums. To emphasize our present view of direct
summation we return to the letters of our earlier
notation.


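THEOREM. If $\mathfrak{M}$ and $\mathfrak{N}$ are linear manifolds
in a vector space $\mathfrak{V}$, and if $\mathfrak{V} =
\mathfrak{M} \oplus \mathfrak{N}$ then $\mathfrak{M}'$ is isomorphic to $\mathfrak{N}^0$ and
$\mathfrak{N}'$ to $\mathfrak{M}^0$, and $\mathfrak{V}' = \mathfrak{M}^0 \oplus \mathfrak{N}^0$.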

PROOF. To simplify the notation we shall use,
throughout this proof, x, x', and $x^0$ for elements of
$\mathfrak{M}$, $\mathfrak{M}'$, and $\mathfrak{M}^0$, respectively, and we reserve,
similarly, the letters y for $\mathfrak{N}$ and z for $\mathfrak{V}$. (This
notation is not meant to suggest that there is any
particular relation between, say, the vectors x in $\mathfrak{M}$
and the vectors x' in $\mathfrak{M}'$.)

If z' belongs to both $\mathfrak{M}^0$ and $\mathfrak{N}^0$, i.e. if z'(x)
= z'(y) = 0, then z'(z) = z'(x+y) = 0, so that $\mathfrak{M}^0$ and
$\mathfrak{N}^0$ are disjoint. If, moreover, z' is any vector in
$\mathfrak{V}'$, we define for every z = x+y, $x^0(z) = z'(y)$ and
$y^0(z) = z'(x)$. It is easy to see that $x^0$ and $y^0$ are
linear functionals on $\mathfrak{V}$ (i.e. elements of $\mathfrak{V}'$)
belonging to $\mathfrak{M}^0$ and $\mathfrak{N}^0$ respectively; since $z' = x^0 + y^0$, it
follows that $\mathfrak{V}'$ is indeed the direct sum of $\mathfrak{M}^0$ and $\mathfrak{N}^0$.

To establish the asserted isomorphisms, we make
correspond to every $x^0$ a y' in $\mathfrak{N}'$ defined by y'(y) =
$x^0(y)$. We leave to the reader the routine verification
that the correspondence $x^0 \to y'$ is linear and one to
one, and therefore an isomorphism, between $\mathfrak{M}^0$ and $\mathfrak{N}'$;
the corresponding result for $\mathfrak{N}^0$ and $\mathfrak{M}'$ follows from
symmetry by interchanging x and y.

We remark, concerning our entire presentation of the
theory of direct sums, that there is nothing magic about
the number two: we could have defined the direct sum of
any finite number of vector spaces, and proved the
obvious analogs of all theorems of the last three sections,
with only the notation becoming more complicated. We
serve warning that we shall use this remark later and
treat the theorems it implies as if we had proved them.



Chapter II 


TRANSFORMATIONS

§20. DEFINITIONS AND EXAMPLES OF LINEAR TRANSFORMATIONS

We come now to the objects that really make vector
spaces interesting.

DEFINITION. A linear transformation (or
operator), A, on a vector space $\mathfrak{V}$ is a
correspondence which assigns to every vector x
in $\mathfrak{V}$ another vector, which we shall denote
by Ax, in $\mathfrak{V}$, in such a way that, identically
in the vectors x, y and scalars $\alpha$, $\beta$, we
have

    (1)  $A(\alpha x + \beta y) = \alpha Ax + \beta Ay$.

We make again the remark that we made in connection
with the definition of linear functionals, namely that
for a linear transformation A, as we defined it, A0 = 0.
For this reason such transformations are sometimes called
homogeneous linear transformations.

Before discussing any properties of linear
transformations we give several examples. We shall not
bother to prove that the transformations we mention are
indeed linear: in all cases the verification of the
validity of (1) is a simple exercise.

(1) Two special transformations of considerable
importance for the study that follows, and for which we
shall consistently reserve the symbols 0 and 1
respectively, are defined, for every x, by 0x = 0 and 1x = x.


(2) Let $x_0$ be any fixed vector in $\mathfrak{V}$, and let
$y_0 = y_0(x)$ be any linear functional on $\mathfrak{V}$; we define
$Ax = y_0(x) x_0$. More generally: let $x_1, \ldots, x_n$ be an
arbitrary finite set of vectors in $\mathfrak{V}$ and let $y_1, \ldots, y_n$
be n linear functionals on $\mathfrak{V}$; we define Ax =
$y_1(x) x_1 + \cdots + y_n(x) x_n$. It is not difficult to prove
that if, in particular, $\mathfrak{V}$ is n-dimensional, and
$x_1, \ldots, x_n$ is a basis in $\mathfrak{V}$, then every linear
transformation A has the form just described.

(3) Let $(j_1, \ldots, j_n)$ be any permutation of the
first n positive integers; for any $x = \{\xi_1, \ldots, \xi_n\}$
in $\mathfrak{C}_n$ define $Ax = \{\xi_{j_1}, \ldots, \xi_{j_n}\}$. Similarly, let
j(t) be any polynomial, with real coefficients, in the
real variable t; for every x = x(t) in $\mathfrak{P}$, define
Ax = x(j(t)).

(4) For any $x = x(t) = \sum_{j=0}^{n-1} \xi_j t^j$ in $\mathfrak{P}_n$,
define $Dx = \sum_{j=1}^{n-1} j \xi_j t^{j-1}$. (We use the letter D here
to remind the reader that Dx is the derivative of the
polynomial x = x(t). We remark that we may have defined
D on $\mathfrak{P}$ as well as on $\mathfrak{P}_n$: we shall make use of this
fact later. Observe that for polynomials the definition
of derivation can be given purely algebraically, and does
not need the usual theory of limiting processes.)

(5) For every $x = x(t) = \sum_{j=0}^{n-1} \xi_j t^j$ in $\mathfrak{P}$
(without subscript) we define $Jx = \sum_{j=0}^{n-1} \frac{\xi_j}{j+1} t^{j+1}$.
(Once more we are disguising by algebraic notation a well
known analytic concept. Just as in (4) Dx stood for
dx/dt, so here Jx is the same as $\int_0^t x(s)\, ds$.)

(6) Let m(t) be any polynomial, with complex
coefficients, in the real variable t. (We may, although
it is not particularly profitable to do so, consider m(t)
as an element of $\mathfrak{P}$.) For every x = x(t) in $\mathfrak{P}$ we
define Mx = m(t)x(t). For later purposes we introduce a
special notation: in case m(t) = t, we shall write T
for the operator M; thus Tx = tx(t).
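
Examples (4), (5), and (6) are purely algebraic, so they can be written out directly on coefficients. The Python sketch below is an illustration only; it assumes a polynomial is stored as the list of its coefficients in increasing powers of t, and the function names follow the letters D, J, M, T of the text while the sample polynomial is hypothetical.

```python
from fractions import Fraction

def D(x):
    """Derivative: sum_j x[j] t^j  ->  sum_{j>=1} j x[j] t^(j-1)."""
    return [j * c for j, c in enumerate(x)][1:]

def J(x):
    """Integral from 0 to t: sum_j x[j] t^j  ->  sum_j x[j]/(j+1) t^(j+1)."""
    return [Fraction(0)] + [Fraction(c) / (j + 1) for j, c in enumerate(x)]

def M(m, x):
    """Multiplication by the polynomial m(t): (Mx)(t) = m(t) x(t)."""
    out = [0] * (len(m) + len(x) - 1)
    for i, a in enumerate(m):
        for j, b in enumerate(x):
            out[i + j] += a * b
    return out

def T(x):
    """The special case m(t) = t."""
    return M([0, 1], x)

x = [5, 0, 3]          # 5 + 3t^2
print(D(x))            # [0, 6], that is 6t
print(J(x) == [0, 5, 0, 1])   # True: 5t + t^3
print(T(x))            # [0, 5, 0, 3], that is 5t + 3t^3
print(D(J(x)) == x)    # True: differentiating the integral recovers x
```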

It is often useful to consider linear transformations
(such, for example, as we mentioned in our definition of
isomorphism) from one vector space to another; in this
book we restrict ourselves to the more special situation.

§21. LINEAR TRANSFORMATIONS AS A VECTOR SPACE

We proceed now to derive certain elementary
properties of, and relations between, linear transformations on
a vector space. More particularly we shall indicate
several ways of making new transformations out of old
ones; we shall generally be satisfied with giving the
definition of the new transformations and we shall omit
the proof of linearity.

If A and B are any two linear transformations,
we define their sum, S = A+B, by the equation Sx = Ax +
Bx (for every x). We observe that the commutativity and
associativity of addition in $\mathfrak{V}$ imply immediately that
the addition of linear transformations is commutative
and associative. Much more than this is true. If we
consider the sum of any A and 0 (as defined in example
(1), §20) we see that A + 0 = A. If for any A, we
denote by -A the transformation defined by (-A)x = -(Ax),
we see that A + (-A) = 0, and that -A, as so defined,
is the only linear transformation B with the property
that A + B = 0. To sum up: the properties of a vector
space, described in axiom (A) of §1, appear again in the
set of all linear transformations on the space; the set
of all linear transformations is an abelian group with
respect to the operation of addition.

We continue in the same spirit; it will not, by
now, surprise anybody if the axiom (B) of vector spaces
is also satisfied by the set of all linear
transformations. It is. For any A, and any scalar $\alpha$, we define
the product of A by $\alpha$, $\alpha A$, by the equation $(\alpha A)x =
\alpha(Ax)$. Axiom (B) is immediately verified; we sum up in
the following theorem.

THEOREM. The set of all linear
transformations on a vector space is itself a
vector space.

We shall usually ignore this theorem; the reason is
that we can say much more about linear transformations
and the mere fact that they form a vector space is used
only very rarely. The "much more" that we can say is
that there exists for linear transformations a more or
less decent definition of multiplication, which we
discuss in the next section.

§22. PRODUCTS OF LINEAR TRANSFORMATIONS

The product P of two linear transformations A
and B, P = AB, is defined by the equation Px = A(Bx).

The notion of multiplication is fundamental for all
that follows. Before giving any examples to illustrate
the meaning of operator products, let us observe the
implications of the symbolism, P = AB. P is a
transformation; i.e. given a vector x, P does something to it.
What it does is found out by operating on x with B,
i.e. finding Bx, and then operating on the result with
A. In other words if we look upon the symbol for an
operator as a recipe for performing a certain act, then
the product of two operators is to be read from right to
left. P = AB means: "Operate first with B, and then
with A". This may seem like an undue amount of fuss to
raise about a small point; however, as we shall soon see,
operator multiplication is not, in general, commutative,
AB ≠ BA, and it makes a lot of difference in which order
we operate.

The most notorious example of non commutative operators
is found in the space $\mathfrak{P}$. We consider $Dx = dx/dt$
and $Tx = tx(t)$ (see §20); we have:

DTx = D(tx(t)) = x(t) + t(dx/dt),

TDx = T(dx/dt) = t(dx/dt).




In other words not only is it false that DT = TD (so
that DT - TD = 0), but in fact, for every x, (DT - TD)x
= x, so that DT - TD = 1.
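The relation DT - TD = 1 can be checked mechanically on any one polynomial. What follows is a minimal sketch only, under the assumption (ours, not part of the text) that a polynomial is stored as its list of coefficients; the helper names `D`, `T`, `sub`, and `trim` are hypothetical.

```python
# A polynomial x(t) = c[0] + c[1] t + c[2] t^2 + ... is stored as the list c.

def D(c):
    """Differentiation: (Dx)(t) = dx/dt."""
    return [k * c[k] for k in range(1, len(c))] or [0]

def T(c):
    """Multiplication by t: (Tx)(t) = t x(t)."""
    return [0] + list(c)

def sub(a, b):
    """Coefficientwise difference a - b, padded to a common length."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [p - q for p, q in zip(a, b)]

def trim(c):
    """Drop trailing zero coefficients, keeping at least one entry."""
    c = list(c)
    while len(c) > 1 and c[-1] == 0:
        c.pop()
    return c

x = [5, -3, 0, 2]                 # x(t) = 5 - 3t + 2t^3
DTx = D(T(x))                     # operate first with T, then with D
TDx = T(D(x))                     # operate first with D, then with T
commutator = trim(sub(DTx, TDx))  # (DT - TD)x
```

Reading the products from right to left, `D(T(x))` is DTx and `T(D(x))` is TDx; the two differ, and their difference is x itself.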

On the basis of the examples in §20 the reader
should be able to construct examples of non commutative
operators. Those who are used to thinking of linear
transformations geometrically can, for example, readily
convince themselves that the product of two rotations of
three-dimensional space (about the origin) depends in
general on the order in which they are performed.

The formal algebraic properties of numerical multiplication
are most of them (with the already mentioned
notable exception of commutativity) valid in the operator
calculus. Thus we have:

(1)  A0 = 0A = 0
(2)  A1 = 1A = A
(3)  A(B + C) = AB + AC
(4)  (A + B)C = AC + BC
(5)  A(BC) = (AB)C.


The proofs of all these identities are immediate
consequences of the definitions of addition and multiplication;
to illustrate the principle we prove (5), the
associativity of multiplication. Let x be any vector,
and denote by L and R the left and right sides of (5)
respectively; we must show that Lx = Rx. We write
y = Cx, z = By, u = Az; then (BC)x = By = z, so that

Lx = A(BC)x = Az = u,

Rx = (AB)Cx = ABy = Az = u.

§23. POLYNOMIALS IN A LINEAR TRANSFORMATION

The associative law of multiplication enables us to 
write the product of three (or more) factors without any 
brackets; In particular we may consider the product of 
any finite number, say m, of factors all equal to A.





This product depends only on A and m (and not, as we
just remarked, on any bracketing of the factors); we
shall denote it by $A^m$. The justification for this notation
is that, although in general operator multiplication
is not commutative, for the powers of one operator
we do have the usual law of exponents, $A^nA^m = A^{n+m}$. We
observe that $A^1 = A$; it is customary also to define
$A^0 = 1$. With these definitions the calculus of powers
of a single transformation is almost exactly the same as
in ordinary arithmetic. We may in particular define
polynomials in a linear transformation. Thus if
$p(\tau) = \alpha_0 + \alpha_1\tau + \cdots + \alpha_n\tau^n$ is any polynomial in the
variable $\tau$ (with scalar coefficients) we may form the
linear transformation

$p(A) = \alpha_0 1 + \alpha_1 A + \cdots + \alpha_n A^n$.

The rules for algebraic manipulation of such polynomials
are easy. Thus $p(\tau)q(\tau) = r(\tau)$ implies p(A)q(A) =
r(A), (so that, in particular, any p(A) and q(A) are
commutative); if $p(\tau) = 1$, then p(A) = 1; if $p(\tau) = 0$,
p(A) = 0; if $p(\tau) + q(\tau) = r(\tau)$ then p(A) +
q(A) = r(A).
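The rule that $p(\tau)q(\tau) = r(\tau)$ implies p(A)q(A) = r(A) can be tried on a small example. The sketch below is illustrative only; it assumes that A is given by a square array of integers (matrices are introduced formally in §25) and that a polynomial is given by its coefficient list. The names `matmul`, `poly_at`, and `poly_mul` are hypothetical helpers.

```python
def matmul(A, B):
    """Product of two square arrays of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, A):
    """Evaluate p(A) = c0*1 + c1*A + ... + cn*A^n by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)
        # add c times the identity
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

def poly_mul(p, q):
    """Ordinary product of two polynomials given by coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

A = [[1, 2], [3, 4]]
p = [1, 0, 2]        # p(tau) = 1 + 2 tau^2
q = [-1, 3]          # q(tau) = -1 + 3 tau
r = poly_mul(p, q)   # r = pq, formed as ordinary polynomials

lhs = matmul(poly_at(p, A), poly_at(q, A))   # p(A) q(A)
rhs = poly_at(r, A)                          # r(A)
```

Since the arithmetic is exact, `lhs` and `rhs` agree entry by entry, and exchanging the roles of p and q gives the same array, which is the commutativity of p(A) and q(A).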

It is not possible to give any sensible interpretation
to p(A,B), if $p(\sigma,\tau)$ is any polynomial in two
variables, and A and B are any two linear transformations.
The trouble, of course, is that A and B may
not commute, and even a simple monomial, such as $\sigma^2\tau$,
will cause confusion. If $p(\sigma,\tau) = \sigma^2\tau$, what should
we mean by p(A,B)? Should it be $A^2B$, or $ABA$, or $BA^2$?
It is important to realize that there is a difficulty
here; fortunately for us it is not necessary to try to
get around it. We shall work with polynomials in several
variables only in connection with commutative transformations,
and here everything is simple. We observe
that if AB = BA then $A^nB^m = B^mA^n$, and, therefore,
p(A,B) has a well defined meaning for every $p(\sigma,\tau)$.
The formal properties of the correspondence between 






transformations and polynomials are just as valid for 
several variables as for one; we omit the details. 

For an example of the possible behaviour of powers
of a transformation we look at the differentiation operator
D, on $\mathfrak{P}$ (or, just as well, on $\mathfrak{P}_n$, for some n).
It is easy to see that for any positive integer k, and
any x = x(t) in $\mathfrak{P}$, $D^kx = d^kx/dt^k$. We observe that,
whatever else D does, it lowers the degree of the polynomial
on which it operates by exactly one unit (assuming
of course that this degree is ≥ 1). Let x = x(t)
be a polynomial of degree n-1, say; what is $D^nx$? Or
put it another way: what is the product of the two
(commutative) operators $D^k$ and $D^{n-k}$ (where k is any
integer between 0 and n-1), considered on the space
$\mathfrak{P}_n$? We mention this example to bring out the disconcerting
fact implied by the answer to the last question:
the product of two operators, neither of which is zero,
may vanish. (Non zero operators whose product with some
other non zero operator is zero are called divisors of
zero.)
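The answer to the question just raised can be seen numerically in a small case. The following is a sketch under assumptions of our own: polynomials of degree at most n-1 are stored as coefficient lists of fixed length n, n = 4 and k = 1 are arbitrary choices, and `D` and `power` are hypothetical helper names.

```python
def D(c):
    """Differentiate a polynomial of degree <= n-1, keeping the list length n."""
    n = len(c)
    return [k * c[k] for k in range(1, n)] + [0]

def power(op, m):
    """Return the operator op^m (op applied m times; m = 0 is the identity)."""
    def applied(c):
        for _ in range(m):
            c = op(c)
        return list(c)
    return applied

n = 4
# the basis 1, t, t^2, t^3 as coefficient lists
basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
k = 1
Dk, Dnk = power(D, k), power(D, n - k)

Dk_nonzero = any(any(Dk(x)) for x in basis)    # D^k is not the zero operator
Dnk_nonzero = any(any(Dnk(x)) for x in basis)  # neither is D^(n-k)
product_zero = all(not any(Dk(Dnk(x))) for x in basis)  # yet D^k D^(n-k) = 0
```

It suffices to test the operators on a basis, since a linear transformation that annihilates every basis vector is zero.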


§24. INVERSE OF A LINEAR TRANSFORMATION 

In each of the two preceding sections we gave an
example: these two examples bring out the two nasty
properties that the multiplication of linear transformations
has, namely non commutativity and the existence of
divisors of zero. We turn now to the more pleasant properties
that linear transformations sometimes have.

It may happen that the linear transformation A
has one or both of the following two very special properties.

(1) $x_1 \neq x_2$ implies $Ax_1 \neq Ax_2$.

(2) To every vector y there corresponds (at least)
one vector x such that Ax = y.

If ever A has both these properties we shall say
that A has an inverse, or that A is non singular,





and we define a linear transformation, called the inverse
of A and denoted by $A^{-1}$, as follows. If y is any
vector we may (by (2)) find an x for which Ax = y.
This x is, moreover, uniquely determined, since $x' \neq x$
implies (by (1)) that $Ax' \neq y = Ax$. We define $A^{-1}y$ to
be x. To prove that $A^{-1}$ is linear, we evaluate
$A^{-1}(\alpha_1y_1 + \alpha_2y_2)$. If $Ax_1 = y_1$ and $Ax_2 = y_2$, then
the linearity of A tells us that $A(\alpha_1x_1 + \alpha_2x_2) =
\alpha_1y_1 + \alpha_2y_2$, so that $A^{-1}(\alpha_1y_1 + \alpha_2y_2) = \alpha_1x_1 +
\alpha_2x_2 = \alpha_1A^{-1}y_1 + \alpha_2A^{-1}y_2$.

As a trivial example of a transformation which has
an inverse we mention the identity transformation 1;
clearly $1^{-1} = 1$. The transformation 0 does not have
an inverse; it violates both the conditions (1) and (2)
about as strongly as they can be violated.

It is immediate from the definition that for any A
which has an inverse we have

$AA^{-1} = A^{-1}A = 1$;

we shall now show that these equations serve to characterize
$A^{-1}$. More precisely:

THEOREM 1. If A, B, and C are linear
transformations such that

AB = CA = 1,

then A has an inverse and $A^{-1} = B = C$.

PROOF. If $Ax_1 = Ax_2$, then $CAx_1 = CAx_2$, so that
(since CA = 1) $x_1 = x_2$; in other words the first condition
of the definition of an inverse is satisfied.
The second condition is also satisfied, for if y is
any vector and x = By, then y = ABy = Ax. Multiplying
AB = 1 on the left, and CA = 1 on the right, by $A^{-1}$,
we see that $A^{-1} = B = C$.

To show that neither AB = 1 nor CA = 1 is by
itself sufficient to ensure the existence of $A^{-1}$, we
call attention to the differentiation and integration
operators D and J, defined in §20, (4) and (5). Although
DJ = 1, neither D nor J has an inverse; D
violates (1), and J violates (2). Query: which of
the other transformations defined in §20 have inverses?
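The behaviour of D and J may be illustrated as follows. This is a sketch only: the definition of J in §20 is not reproduced in this part of the text, so we assume that J is integration from 0 to t, and we again store polynomials as coefficient lists. The function names are hypothetical.

```python
from fractions import Fraction

def D(c):
    """Differentiation on coefficient lists."""
    return [k * c[k] for k in range(1, len(c))] or [Fraction(0)]

def J(c):
    """Integration from 0 to t (the assumed form of J)."""
    return [Fraction(0)] + [Fraction(c[k], k + 1) for k in range(len(c))]

x = [Fraction(7), Fraction(0), Fraction(3)]   # x(t) = 7 + 3t^2

DJx = D(J(x))   # DJ = 1: integrating and then differentiating restores x
JDx = J(D(x))   # JD loses the constant term of x

# D sends two different vectors to the same vector, violating (1)
D_of_constants = (D([Fraction(1)]), D([Fraction(2)]))
```

Under this assumption every vector of the form Jx has constant term zero, so that the polynomial 1, for instance, is not of the form Jx; this is the violation of (2).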

In finite dimensional spaces the situation is much
simpler.

THEOREM 2. A linear transformation A
on a finite dimensional vector space $\mathfrak{V}$ has
an inverse if and only if Ax = 0 implies
x = 0, or, alternatively, if and only if every
y in $\mathfrak{V}$ can be written in the form y = Ax.

PROOF. If $A^{-1}$ exists the condition is satisfied;
this much is trivial. Suppose now that Ax = 0 implies
x = 0. Then $x' \neq x''$, i.e. $x' - x'' \neq 0$, implies
$A(x' - x'') \neq 0$, i.e. $Ax' \neq Ax''$; this proves (1). To
prove (2), let $(x_1, x_2, \ldots, x_n)$ be a basis in $\mathfrak{V}$; we
assert that $(Ax_1, \ldots, Ax_n)$ is also a basis. According
to Theorem 2, §7, we need only prove linear independence.
But $\sum_i \alpha_iAx_i = 0$ means $A(\sum_i \alpha_ix_i) = 0$, and,
by hypothesis, this implies $\sum_i \alpha_ix_i = 0$; the linear
independence of the $x_i$ now tells us that $\alpha_1 = \cdots =
\alpha_n = 0$. It follows, of course, that every vector y
may be written in the form $y = \sum_i \alpha_iAx_i = A(\sum_i \alpha_ix_i)$.

Let us assume next that every y is an Ax, and
let $(y_1, \ldots, y_n)$ be any basis in $\mathfrak{V}$. Corresponding
to each $y_i$ we may find a (not necessarily unique) $x_i$
for which $y_i = Ax_i$; we assert that $(x_1, \ldots, x_n)$ is
also a basis. For $\sum_i \alpha_ix_i = 0$ implies $\sum_i \alpha_iAx_i =
\sum_i \alpha_iy_i = 0$, so that $\alpha_1 = \cdots = \alpha_n = 0$. Consequently
every x may be written in the form $x = \sum_i \alpha_ix_i$, and
Ax = 0 implies, as in the argument just given, that
x = 0.





THEOREM 3. If $A^{-1}$ and $B^{-1}$ exist
then $(AB)^{-1}$ exists and $(AB)^{-1} = B^{-1}A^{-1}$;
if $A^{-1}$ exists and $\alpha \neq 0$ then $(\alpha A)^{-1}$
exists and $(\alpha A)^{-1} = (1/\alpha)A^{-1}$; if $A^{-1}$
exists then so does $(A^{-1})^{-1}$ and $(A^{-1})^{-1} = A$.

PROOF. According to Theorem 1, it is sufficient to
prove (for the first statement) that the product of AB
with $B^{-1}A^{-1}$, in both orders, is the identity; this
verification we leave to the reader. The proofs of both
the remaining statements are identical in principle with
this one; the last statement, for example, follows from
the fact that the equations $AA^{-1} = A^{-1}A = 1$ are completely
symmetric in A and $A^{-1}$.
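The verification left to the reader can be carried out on a particular pair of transformations. The following is a sketch under our own assumptions: A and B are represented by 2 by 2 arrays of integers chosen arbitrarily, and `matmul` and `inverse_2x2` are hypothetical helpers, not part of the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square arrays of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Inverse of a 2 by 2 array, by the usual determinant formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("the array is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 3], [0, 1]]
ONE = [[1, 0], [0, 1]]

AB = matmul(A, B)
candidate = matmul(inverse_2x2(B), inverse_2x2(A))     # B^{-1} A^{-1}
wrong_order = matmul(inverse_2x2(A), inverse_2x2(B))   # A^{-1} B^{-1}
```

The product of `AB` with `candidate` is the identity in both orders, as Theorem 1 requires; the inverses taken in the other order do not give the inverse of AB for this pair, since A and B do not commute.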

We conclude our discussion of inverses with the
following comment. In the spirit of the preceding section
we may if we like define rational functions of A,
whenever possible, by using $A^{-1}$. We shall not find it
useful to do this except in one case: if $A^{-1}$ exists,
then we know that $(A^n)^{-1}$ also exists, for n = 1, 2,
...; we shall write $A^{-n}$ for this latter transformation,
so that $A^{-n} = (A^n)^{-1} = (A^{-1})^n$.

§25. DEFINITION OF MATRICES 

Let us now pick up the loose threads: we have introduced
the new concept of linear transformation and we
must find out what it has to do with the old concepts
of bases, linear functionals, etc.

One of the most important concepts in the study of
operators on finite dimensional vector spaces is that of
a matrix. Since this concept usually has no decent analog
in infinite dimensional spaces, and since it is possible
in most considerations to do without it, we shall
try not to use it in proving theorems. It is, however,
important to know what a matrix is: we enter now into
the detailed discussion.





DEFINITION. Let $\mathfrak{V}$ be an n dimensional
vector space and let $\mathfrak{X} = (x_1, \ldots, x_n)$ be
any basis in $\mathfrak{V}$; let A be a linear transformation
on $\mathfrak{V}$. Since every vector is a
linear combination of the $x_i$, we have in particular

(1)  $Ax_j = \sum_i \alpha_{ij}x_i$

for j = 1, ..., n. The set $(\alpha_{ij})$ of
scalars, indexed with the double subscript i,j,
is the matrix of A in the coordinate system
$\mathfrak{X}$; we shall generally denote it by [A], or,
if it becomes necessary to indicate the particular
basis $\mathfrak{X}$ under consideration, by [A; $\mathfrak{X}$].
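A matrix $(\alpha_{ij})$ is usually written in the
form of a square array:

$$[A] = \begin{bmatrix}
\alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\
\alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\
\vdots & \vdots & & \vdots \\
\alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn}
\end{bmatrix};$$

the scalars $(\alpha_{i1}, \ldots, \alpha_{in})$ form a row, and
$(\alpha_{1j}, \ldots, \alpha_{nj})$ a column, of [A].
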


The above definition does not define "matrix"; it
defines "the matrix associated under certain conditions
with a linear transformation". It is often useful to
consider a matrix as something existing in its own right
as a square array of scalars; in general, however, a
matrix in this book will be tied up with a linear transformation
and a basis. Rectangular (not square) matrices
are usually considered in connection with linear transformations
from one vector space to another; since we do
not treat such transformations we shall also not discuss
such matrices. Their theory is not very different from




the theory we shall develop; in particular the reader
will find it useful, after he has read the next section
treating the square case, to try to define the algebraic
operations (sum, product, etc.) for rectangular matrices.

We comment on notation. It is customary to use the
same symbol, say A, for the matrix as for the transformation.
The justification for this is to be found in
the discussion below (of properties of matrices). We do
not follow this custom here because one of our principal
aims, in connection with matrices, is to emphasize that
they depend on the coordinate system (whereas the notion
of linear transformation does not), and to study how the
relation between matrices and operators changes as we
pass from one coordinate system to another.

We call attention also to a peculiarity of the indexing
of the elements $\alpha_{ij}$ of a matrix [A]. A basis
is a basis, and so far, although we usually indexed its
elements with the first n positive integers, the order
of elements in it was entirely immaterial. It is customary,
however, when speaking of matrices, to refer to, say,
the first row or first column. This language is justified
only if we think of the elements of the basis $\mathfrak{X}$ as
arranged in a definite order. Since in the majority of
our considerations the order of the rows and columns of
a matrix is as irrelevant as the order of the elements
of a basis, we did not include this aspect of matrices
in our definition. It is important to realize that the
appearance of the square array associated with [A]
varies with the ordering of $\mathfrak{X}$. Everything we shall
say about matrices (with very few exceptions, occurring
mostly in the theory of direct products) can, accordingly,
be interpreted from two different points of view:
either in strict accordance with the letter of our definition,
or else following a modified definition which
makes correspond a matrix (with ordered rows and columns)
not merely to a linear transformation and a basis, but







also to an ordering of the basis.

(Query: if $(i_1, \ldots, i_n)$ is a permutation of the
first n positive integers and $(\alpha_{ij})$ is the matrix of
A in the ordered coordinate system $x_1, \ldots, x_n$, what
is the matrix of A in the coordinate system $x_{i_1}, \ldots,
x_{i_n}$? Compare §20, (3)).

One more word to those in the know. It is a perversity,
not of the author, but of nature that we write
$Ax_j = \sum_i \alpha_{ij}x_i$ instead of the more usual formula
$Ax_i = \sum_j \alpha_{ij}x_j$; the reason is that we want the formulae
for matrix multiplication and the application of
matrices to numerical vectors (i.e. n-tuples $(\xi_1, \ldots,
\xi_n)$ of scalars) to appear normal, and somewhere in the
process of passing from vectors to their coordinates the
indices turn around. To state our rule explicitly:
write $Ax_j$ as a linear combination of $x_1, \ldots, x_n$, and
write the coefficients so obtained as the j-th column of
the matrix [A]. (The first index on $\alpha_{ij}$ is always the
row index, the second one the column index.)

For an example we consider the differentiation operator
D in the space $\mathfrak{P}_n$, and the basis $x_1 = x_1(t) = 1$,
$x_2 = x_2(t) = t$, ..., $x_n = x_n(t) = t^{n-1}$. What is the
matrix of D in this basis? We have

(2)
$Dx_1 = 0 = 0x_1 + 0x_2 + \cdots + 0x_{n-1} + 0x_n$
$Dx_2 = 1 = 1x_1 + 0x_2 + \cdots + 0x_{n-1} + 0x_n$
$Dx_3 = 2t = 0x_1 + 2x_2 + \cdots + 0x_{n-1} + 0x_n$
...
$Dx_n = (n-1)t^{n-2} = 0x_1 + 0x_2 + \cdots + (n-1)x_{n-1} + 0x_n$,

so that

(3)
$$[D] = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & n-1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}.$$

(The unpleasant phenomenon of indices turning around is
seen by comparing (2) and (3)).
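The rule "write the coordinates of $Ax_j$ as the j-th column" can be followed literally by a short program. This is a minimal sketch, assuming (as our own convention) that a polynomial of degree at most n-1 is stored as a coefficient list of length n, so that the list is at the same time the coordinate vector with respect to the basis 1, t, ..., $t^{n-1}$; `matrix_of` is a hypothetical helper name.

```python
def D(c):
    """Differentiate a polynomial of degree <= n-1, keeping the list length n."""
    n = len(c)
    return [k * c[k] for k in range(1, n)] + [0]

def matrix_of(op, n):
    """Matrix of op in the basis 1, t, ..., t^(n-1).

    Column j holds the coordinates of op applied to the j-th basis vector.
    """
    columns = [op([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    # transpose the list of columns into a list of rows
    return [[columns[j][i] for j in range(n)] for i in range(n)]

M = matrix_of(D, 4)
for row in M:
    print(row)
```

For n = 4 the rows printed are those of the array (3): the entries 1, 2, 3 appear just above the diagonal and every other entry is zero.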

§26. ISOMORPHISM BETWEEN MATRICES AND OPERATORS

There is now a certain amount of routine work to be
done, most of which we shall leave to the imagination.

The problem is this: in a fixed coordinate system
$\mathfrak{X} = (x_1, \ldots, x_n)$, knowing the matrices of A and B,
how can we find the matrices of $\alpha A + \beta B$, of AB, of 0,
of 1, etc.?

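Write $[A] = (\alpha_{ij})$, $[B] = (\beta_{ij})$, $C = \alpha A + \beta B$,
$[C] = (\gamma_{ij})$; we assert that

(1)  $\gamma_{ij} = \alpha\alpha_{ij} + \beta\beta_{ij}$;

also, if $[0] = (o_{ij})$, $[1] = (e_{ij})$, then

(2)  $o_{ij} = 0$

and

(3)  $e_{ij} = \delta_{ij}$ (= the Kronecker delta).

A more complicated rule is the following: if C = AB,
$[C] = (\gamma_{ij})$, then

(4)  $\gamma_{ij} = \sum_k \alpha_{ik}\beta_{kj}$.

To prove this we use the definition of matrices and
juggle, thus:

$Cx_j = A(Bx_j) = A(\sum_k \beta_{kj}x_k)
= \sum_k \beta_{kj}Ax_k = \sum_k \beta_{kj}(\sum_i \alpha_{ik}x_i)
= \sum_i (\sum_k \alpha_{ik}\beta_{kj})x_i$.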




The relation between operators and matrices is
exactly the same as the relation between vectors and
their coordinates, and the analog of the isomorphism
theorem of §8 is true in the best possible sense. We
shall make these statements precise.

With the aid of a fixed basis $\mathfrak{X}$ we have made correspond
a matrix [A] to every linear transformation A:
the correspondence is described by the relations $Ax_j =
\sum_i \alpha_{ij}x_i$ (§25, (1)). We assert now that this correspondence
is one to one, in the sense that the matrices
of two different operators are different, and every
array $(\alpha_{ij})$ of $n^2$ scalars is the matrix of some operator.
To prove this, we observe in the first place
that knowledge of the matrix of A completely determines
A (i.e. that Ax is thereby uniquely defined for every
x), according to the relations:

(5)  $x = \sum_j \xi_jx_j$,

(6)  $Ax = \sum_j \xi_jAx_j = \sum_j \xi_j \sum_i \alpha_{ij}x_i
= \sum_i (\sum_j \alpha_{ij}\xi_j)x_i$.

(In other words if $y = Ax = \sum_i \eta_ix_i$ then

(7)  $\eta_i = \sum_j \alpha_{ij}\xi_j$.

Compare this with §25, (2), and the subsequent comments
on the perversity of indices). In the second place,
there is no law against reading the relation $Ax_j =
\sum_i \alpha_{ij}x_i$ backwards: if, in other words, $(\alpha_{ij})$ is any
array, we may write this relation as the definition of a
linear transformation A; it is clear that the matrix of
A will be exactly $(\alpha_{ij})$. (We emphasize, however,
once more the fundamental fact that this one to one correspondence
between operators and matrices was set up by
means of a particular coordinate system, and that, as we
pass from one coordinate system to another, the same
linear transformation may correspond to several matrices,
and one matrix may be the correspondent of many linear
transformations).

We sum up. 

THEOREM. Among the set of all matrices
$(\alpha_{ij})$, $(\beta_{ij})$, etc., i,j = 1, ..., n, (not
considered in relation to linear transformations),
we define sum, scalar multiplication,
product, $(o_{ij})$, and $(e_{ij})$, by

$(\alpha_{ij}) + (\beta_{ij}) = (\alpha_{ij} + \beta_{ij})$,

$\alpha(\alpha_{ij}) = (\alpha\alpha_{ij})$,

$(\alpha_{ij})(\beta_{ij}) = (\sum_k \alpha_{ik}\beta_{kj})$,

$o_{ij} = 0$, $e_{ij} = \delta_{ij}$.

Then the correspondence (established by means
of an arbitrary coordinate system $\mathfrak{X} = (x_1, \ldots,
x_n)$ of the n-dimensional vector space $\mathfrak{V}$),
between all linear transformations A on $\mathfrak{V}$
and all matrices $(\alpha_{ij})$, described by
$Ax_j = \sum_i \alpha_{ij}x_i$, is an isomorphism: in other
words it is a one to one correspondence that
preserves sum, scalar multiplication, product,
0, and 1.
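The assertion that the correspondence preserves products, and the coordinate rule (7), can be tested on two concrete operators. The sketch below rests on assumptions of our own: the space is $\mathfrak{P}_3$ with the basis 1, t, $t^2$, polynomials are coefficient lists of length 3, and the second operator R, which replaces x(t) by x(-t), is introduced only for this illustration. All function names are hypothetical.

```python
def matmul(A, B):
    """Product of two square arrays: entry (i, j) is sum_k A[i][k] * B[k][j]."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, xi):
    """Rule (7): eta_i = sum_j alpha_ij xi_j."""
    return [sum(M[i][j] * xi[j] for j in range(len(xi))) for i in range(len(M))]

def D(c):
    """Differentiation, keeping the list length."""
    return [k * c[k] for k in range(1, len(c))] + [0]

def R(c):
    """Reflection: x(t) -> x(-t)."""
    return [(-1) ** k * c[k] for k in range(len(c))]

def DR(c):
    """The product DR: operate first with R, then with D."""
    return D(R(c))

def RD(c):
    """The product RD: operate first with D, then with R."""
    return R(D(c))

def matrix_of(op, n):
    """Column j holds the coordinates of op applied to the j-th basis vector."""
    columns = [op([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [[columns[j][i] for j in range(n)] for i in range(n)]

n = 3
mD, mR = matrix_of(D, n), matrix_of(R, n)
x = [4, 5, 6]   # x(t) = 4 + 5t + 6t^2
```

With the column convention of §25 the matrix of DR is the product [D][R] in that order, and applying [D] to the coordinates of x gives the coordinates of Dx.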

We have carefully avoided discussing the matrix of
$A^{-1}$. It is possible to give an expression for $[A^{-1}]$ in
terms of the elements $\alpha_{ij}$ of [A], but the expression
is not simple and, for us, fortunately, not useful.

§27. REDUCIBILITY

A possible relation between linear manifolds $\mathfrak{M}$ in
a vector space $\mathfrak{V}$ and linear transformations A on $\mathfrak{V}$
is that of invariance. $\mathfrak{M}$ is invariant under A if x
in $\mathfrak{M}$ implies that Ax is in $\mathfrak{M}$. (Observe that the
implication relation is required in only one direction:
we do not assume that every y in $\mathfrak{M}$ can be written in
the form y = Ax with x in $\mathfrak{M}$; we do not even assume
that Ax in $\mathfrak{M}$ implies x in $\mathfrak{M}$. See example below).
Another locution for the same concept is: $\mathfrak{M}$ reduces A.
(Reducibility is often defined for sets of linear transformations
as well as for a single one: $\mathfrak{M}$ reduces a set
if it reduces each member of the set). We know that a
linear manifold $\mathfrak{M}$ in a vector space is itself a vector
space: if we know that $\mathfrak{M}$ reduces A we may ignore the
fact that A is defined outside $\mathfrak{M}$ and we may consider
A as a linear transformation defined on the vector space
$\mathfrak{M}$.

What can be said about the matrix of an operator A,
on an n-dimensional vector space $\mathfrak{V}$, which is reduced
by some $\mathfrak{M}$? In other words: is there a clever way of
selecting a basis $\mathfrak{X} = (x_1, \ldots, x_n)$ in $\mathfrak{V}$ so that [A]
= [A; $\mathfrak{X}$] will have some particularly simple form?
The answer is in Theorem 2 of §11: we may choose $\mathfrak{X}$ so
that $x_1, \ldots, x_m$ are in $\mathfrak{M}$ and $x_{m+1}, \ldots, x_n$ are not.
Let us express $Ax_j$ in terms of $x_1, \ldots, x_n$. For
$m+1 \leq j \leq n$ there is not much we can say: $Ax_j =
\sum_i \alpha_{ij}x_i$. For $1 \leq j \leq m$, however, $x_j$ is in $\mathfrak{M}$, and
therefore (since $\mathfrak{M}$ is invariant under A) $Ax_j$ is in
$\mathfrak{M}$. Consequently $Ax_j$ is in this case a linear combination
of $x_1, \ldots, x_m$; the $\alpha_{ij}$ with $m+1 \leq i \leq n$ are
zero. Hence the matrix [A] of A, in this coordinate
system, will have the form

(1)
$$[A] = \begin{bmatrix} [A_1] & [B_0] \\ [0] & [A_2] \end{bmatrix},$$

where $[A_1]$ is the (m rowed) matrix of A when considered
as a linear transformation on the space $\mathfrak{M}$, with
the coordinate system $x_1, \ldots, x_m$, $[A_2]$ and $[B_0]$ are
some arrays of numbers (in size (n-m) times (n-m) and
m times (n-m) respectively), and [0] denotes the rectangular
((n-m) times m) array consisting of zeros only.
(It is important to observe the unpleasant fact that
$[B_0]$ need not be zero).

For an example we may consider the differentiation
operator D on the space $\mathfrak{P}_n$ and the linear manifold
$\mathfrak{M}$ spanned by the vectors $1, t, t^2, \ldots, t^m$, $1 \leq m < n$.
We leave it to the reader to verify that in this case $\mathfrak{M}$
is indeed invariant under D, but all the unpleasant possibilities
we have been hinting at ($[B_0] \neq [0]$, D singular
in $\mathfrak{M}$, etc.) become actualities.
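The block form of the matrix of D can be exhibited in a small case. This is a sketch only; we assume n = 4, we take for the invariant manifold the one spanned by 1 and t, and we write `dim` for its dimension (a name of our own, as are the block names below).

```python
n, dim = 4, 2   # the space of polynomials of degree <= 3; manifold spanned by 1, t

# Matrix of D in the basis 1, t, t^2, t^3: the entry j stands in row j-1, column j.
M = [[j if j == i + 1 else 0 for j in range(n)] for i in range(n)]

# Cut the array into the four blocks of the form (1).
A1 = [row[:dim] for row in M[:dim]]    # D considered on the manifold alone
B0 = [row[dim:] for row in M[:dim]]    # upper right block
ZERO = [row[:dim] for row in M[dim:]]  # lower left block
A2 = [row[dim:] for row in M[dim:]]    # lower right block

lower_left_vanishes = all(entry == 0 for row in ZERO for entry in row)
upper_right_vanishes = all(entry == 0 for row in B0 for entry in row)
```

The lower left block vanishes, as invariance requires; the upper right block does not, and the block belonging to the manifold has a column of zeros, so that D is singular there.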

§28. COMPLETE REDUCIBILITY AND DIRECT SUMS
OF TRANSFORMATIONS

A particularly important subcase of the notion of
reducibility is that of complete reducibility. If $\mathfrak{M}$
and $\mathfrak{N}$ are two linear manifolds such that both are invariant
under A and such that $\mathfrak{V}$ is their direct sum,
then A is completely reduced (decomposed) by the pair
($\mathfrak{M}$, $\mathfrak{N}$). (The difference between complete reducibility
and just plain reducibility is that in the latter case
among the collection of all linear manifolds invariant
under A we may not be able to pick out any two, other
than $\mathfrak{O}$ and $\mathfrak{V}$, with the property that $\mathfrak{V}$ is their
direct sum. Or, saying it the other way, if $\mathfrak{M}$ is invariant
under A, there are, to be sure, many ways of
finding an $\mathfrak{N}$ such that $\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{N}$, but it may
happen that no such $\mathfrak{N}$ will be invariant under A).

The process described above may also be turned
around. Let $\mathfrak{M}$ and $\mathfrak{N}$ be any two vector spaces, and
let A and B be any two linear transformations (on $\mathfrak{M}$
and $\mathfrak{N}$ respectively). Let $\mathfrak{V}$ be the direct sum $\mathfrak{M} \oplus \mathfrak{N}$;
we may define on $\mathfrak{V}$ a linear transformation C, called
the direct sum of A and B, defined by

$Cz = C\{x, y\} = \{Ax, By\}$.

We shall omit the detailed discussion of direct sums of
transformations: we shall merely mention the results.
Their proofs are easy. If ($\mathfrak{M}$, $\mathfrak{N}$) completely reduces
C, and if we denote by A the linear transformation C
considered on $\mathfrak{M}$ alone, and by B the linear transformation
C considered on $\mathfrak{N}$ alone, then C is the
direct sum of A and B. By suitable choice of basis,
namely by choosing $x_1, \ldots, x_m$ in $\mathfrak{M}$ and $x_{m+1}, \ldots, x_n$
in $\mathfrak{N}$, the matrix of the direct sum of A and B
will have the form (1) of §27, with $[A_1] = [A]$, $[B_0] =
[0]$, and $[A_2] = [B]$. If $p(\tau)$ is any polynomial, and
if we write A' = p(A), B' = p(B), then the direct sum,
C', of A' and B', will be p(C).

§29. PROJECTIONS 

More important for our purposes is another connection
between direct sums and linear transformations.

DEFINITION. If $\mathfrak{V}$ is the direct sum
of $\mathfrak{M}$ and $\mathfrak{N}$, so that every z in $\mathfrak{V}$ may
be written, uniquely, in the form z = x+y,
with x in $\mathfrak{M}$ and y in $\mathfrak{N}$, we define a
transformation E by Ez = x. E is called
the projection on $\mathfrak{M}$ along $\mathfrak{N}$.

Granting that direct sums are important, projections
are also, since, as we shall see, they are a very powerful
algebraic tool in studying the geometric concept of
direct sum. (The reader will easily satisfy himself
about the reason for the word "projection", by drawing a
pair of axes (linear manifolds) in the plane (their direct
sum). To make the picture look general enough do not
draw perpendicular axes!)

We skipped over one point whose proof is easy enough
to skip over, but whose existence should be recognized:
it must be shown that E is a linear transformation. We
leave this verification to the reader, and go on to look
for special properties of projections.




THEOREM 1. A linear transformation E
is a projection on some linear manifold $\mathfrak{M}$ if
and only if it is idempotent, i.e. $E^2 = E$.

PROOF. If E is the projection on $\mathfrak{M}$ along $\mathfrak{N}$,
and if z = x+y is the decomposition of z, with x in
$\mathfrak{M}$ and y in $\mathfrak{N}$, then the decomposition of x is
x+0, so that

$E^2z = EEz = Ex = x = Ez$.

Conversely, suppose that $E^2 = E$. Let $\mathfrak{N}$ be the
set of all vectors z in $\mathfrak{V}$ for which Ez = 0; let $\mathfrak{M}$
be the set of all vectors z for which Ez = z. It is
clear that both $\mathfrak{M}$ and $\mathfrak{N}$ are linear manifolds: we
shall prove that $\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{N}$. In view of the
theorem of §17 we need to prove that $\mathfrak{M}$ and $\mathfrak{N}$ are disjoint
and that together they span $\mathfrak{V}$.

If z is in $\mathfrak{M}$, Ez = z; if z is in $\mathfrak{N}$, Ez = 0;
hence if z is in both $\mathfrak{M}$ and $\mathfrak{N}$, z = 0. For an arbitrary
z we have

z = Ez + (1-E)z.

If we write Ez = x and (1-E)z = y, then $Ex = E^2z = Ez
= x$, and $Ey = E(1-E)z = Ez - E^2z = 0$, so that x is in
$\mathfrak{M}$ and y is in $\mathfrak{N}$. This proves that $\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{N}$,
and that the projection on $\mathfrak{M}$ along $\mathfrak{N}$ is precisely E.
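The picture with non perpendicular axes can be turned into a small computation. The sketch below makes assumptions of our own: the space is the plane, the first manifold is spanned by (1, 0), the second by (1, 1), and vectors and transformations are written in the usual coordinates. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square arrays of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    """Apply the array M to the coordinate vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# z = (a, b) decomposes as x + y with x = (a - b, 0) and y = (b, b),
# so the projection on the first axis along the second sends (a, b) to (a - b, 0).
E = [[1, -1], [0, 0]]
ONE = [[1, 0], [0, 1]]
F = [[ONE[i][j] - E[i][j] for j in range(2)] for i in range(2)]   # 1 - E

z = [3, 5]
x, y = apply(E, z), apply(F, z)   # the two components of z
```

Both E and 1 - E are idempotent; E leaves the first axis fixed and annihilates the second, and the two components of z add up to z.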

As an immediate consequence of the above proof we
obtain also:

THEOREM 2. If E is the projection on
$\mathfrak{M}$ along $\mathfrak{N}$, then $\mathfrak{M}$ and $\mathfrak{N}$ are, respectively,
the sets of all solutions of the equations
Ez = z, and Ez = 0.

By means of these two theorems we can remove the apparent
asymmetry in the definition of projections between
the roles played by $\mathfrak{M}$ and $\mathfrak{N}$. If to every z = x+y
we make correspond not x, but y, we also get an idempotent
linear transformation. This transformation
(namely 1-E) is the projection on $\mathfrak{N}$ along $\mathfrak{M}$. Hence:

THEOREM 3. E is a projection if and
only if 1-E is a projection; if E is the
projection on $\mathfrak{M}$ along $\mathfrak{N}$, then 1-E is
the projection on $\mathfrak{N}$ along $\mathfrak{M}$.

Query: what is the necessary and sufficient condition
on $y_0(x_0)$, in the first example of (2), §20, in
order that the A defined there be a projection?

§30. ALGEBRAIC COMBINATIONS OF PROJECTIONS

Continuing in the spirit of Theorem 3 of the preceding
section we investigate conditions under which
various algebraic combinations of projections are themselves
projections.

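THEOREM. We assume that $E_1$ and $E_2$
are projections on $\mathfrak{M}_1$ and $\mathfrak{M}_2$ along $\mathfrak{N}_1$ and
$\mathfrak{N}_2$ respectively. We make three assertions.

(i) $E_1 + E_2$ is a projection if and only
if $E_1E_2 = E_2E_1 = 0$; if this condition is satisfied
then $E = E_1 + E_2$ is the projection on
$\mathfrak{M}$ along $\mathfrak{N}$, where $\mathfrak{M} = \mathfrak{M}_1 \oplus \mathfrak{M}_2$ and
$\mathfrak{N} = \mathfrak{N}_1 \cap \mathfrak{N}_2$.

(ii) $E_1 - E_2$ is a projection if and
only if $E_1E_2 = E_2E_1 = E_2$; if this condition
is satisfied then $E = E_1 - E_2$ is the projection
on $\mathfrak{M}$ along $\mathfrak{N}$, where $\mathfrak{M} = \mathfrak{M}_1 \cap \mathfrak{N}_2$,
and $\mathfrak{N} = \mathfrak{N}_1 \oplus \mathfrak{M}_2$.

(iii) If $E_1E_2 = E_2E_1 = E$ then E is the
projection on $\mathfrak{M}$ along $\mathfrak{N}$ where $\mathfrak{M} = \mathfrak{M}_1 \cap \mathfrak{M}_2$
and $\mathfrak{N} = \mathfrak{N}_1 + \mathfrak{N}_2$.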




PROOF. We recall the notation used: for linear
manifolds $\mathfrak{H}$ and $\mathfrak{K}$, $\mathfrak{H} + \mathfrak{K}$ denotes the linear manifold
spanned by $\mathfrak{H}$ and $\mathfrak{K}$; writing $\mathfrak{H} \oplus \mathfrak{K}$ implies
that $\mathfrak{H}$ and $\mathfrak{K}$ are disjoint, and then $\mathfrak{H} \oplus \mathfrak{K} = \mathfrak{H} +
\mathfrak{K}$; and $\mathfrak{H} \cap \mathfrak{K}$ is the intersection of $\mathfrak{H}$ and $\mathfrak{K}$.

(i) If $E_1 + E_2 = E$ is a projection then
$(E_1 + E_2)^2 = E^2 = E = E_1 + E_2$, whence the cross product
terms must disappear:

(1)  $E_1E_2 + E_2E_1 = 0$.

If we multiply (1) on both left and right by $E_1$ we obtain

(2)  $E_1E_2 + E_1E_2E_1 = 0$,

(3)  $E_1E_2E_1 + E_2E_1 = 0$;

subtracting we get $E_1E_2 - E_2E_1 = 0$. Hence $E_1$ and $E_2$
are commutative, and (1) implies that their product is
zero. Since, conversely, $E_1E_2 = E_2E_1 = 0$ clearly implies
(1) we see that the condition is also sufficient to
ensure that E be a projection.

Let us suppose, from now on, that E is a projection;
by Theorem 2 of §29, $\mathfrak{M}$ and $\mathfrak{N}$ are, respectively,
the sets of all solutions of the equations Ez = z and
Ez = 0. Let us write $z = x_1 + y_1 = x_2 + y_2$, where
$x_i = E_iz$ is in $\mathfrak{M}_i$ and $y_i = (1-E_i)z$ is in $\mathfrak{N}_i$,
i = 1, 2. If z is in $\mathfrak{M}$, $E_1z + E_2z = z$, then

$z = E_1(x_2+y_2) + E_2(x_1+y_1) = E_1y_2 + E_2y_1$.

Since $E_1(E_1y_2) = E_1y_2$ and $E_2(E_2y_1) = E_2y_1$, we have exhibited
z as a sum of a vector from $\mathfrak{M}_1$ and a vector
from $\mathfrak{M}_2$, so that $\mathfrak{M} \subset \mathfrak{M}_1 + \mathfrak{M}_2$. Conversely, if z
is a sum of a vector from $\mathfrak{M}_1$ and a vector from $\mathfrak{M}_2$,
then $(E_1 + E_2)z = z$, so that z is in $\mathfrak{M}$, and consequently
$\mathfrak{M} = \mathfrak{M}_1 + \mathfrak{M}_2$. Finally, if z belongs to
both $\mathfrak{M}_1$ and $\mathfrak{M}_2$, $E_1z = E_2z = z$, then $z = E_1z =
E_1(E_2z) = 0$, so that $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are disjoint, and
$\mathfrak{M} = \mathfrak{M}_1 \oplus \mathfrak{M}_2$.

It remains to find $\mathfrak{N}$, i.e. to find all solutions
of $E_1z + E_2z = 0$. If z is in $\mathfrak{N}_1 \cap \mathfrak{N}_2$ this equation
is clearly satisfied; conversely $E_1z + E_2z = 0$
implies (upon multiplication on the left by $E_1$ and $E_2$
respectively) that $E_1z + E_1E_2z = 0$ and $E_2E_1z + E_2z = 0$.
Since $E_1E_2z = E_2E_1z = 0$ for all z, we obtain finally
$E_1z = E_2z = 0$, so that z belongs to both $\mathfrak{N}_1$ and
$\mathfrak{N}_2$.

Using the technique and the results obtained in this 
proof, the proofs of the remaining parts of the theorem 
are easy. 

(ii) According to Theorem 3 of §29, $E_1 - E_2$ is a
projection if and only if $1 - (E_1 - E_2) = (1 - E_1) + E_2$
is a projection. According to (i) this happens (since,
of course, $1 - E_1$ is a projection on $\mathfrak{N}_1$ along $\mathfrak{M}_1$)
if and only if

(4)  $(1 - E_1)E_2 = E_2(1 - E_1) = 0$,

and in this case $(1 - E_1) + E_2$ is the projection on
$\mathfrak{N}_1 \oplus \mathfrak{M}_2$ along $\mathfrak{M}_1 \cap \mathfrak{N}_2$. Since (4) is
equivalent to $E_1E_2 = E_2E_1 = E_2$, the proof of (ii) is
complete.

(iii) That $E = E_1E_2 = E_2E_1$ implies that E is a
projection is clear, since E is idempotent. We assume,
therefore, that $E_1$ and $E_2$ commute and we find $\mathfrak{M}$ and
$\mathfrak{N}$. If Ez = z, then $E_1z = E_1Ez = E_1E_1E_2z = E_1E_2z =
z$, and similarly $E_2z = z$, so that z is contained in
both $\mathfrak{M}_1$ and $\mathfrak{M}_2$. The converse is clear: if
$E_1z = z = E_2z$, then Ez = z. Suppose next that $E_1E_2z =
0$; it follows that $E_2z$ belongs to $\mathfrak{N}_1$, and, from the
commutativity of $E_1$ and $E_2$, that $E_1z$ belongs to
$\mathfrak{N}_2$. This is more symmetry than we need; since $z =
E_2z + (1 - E_2)z$, and since $(1 - E_2)z$ is in $\mathfrak{N}_2$, we have
exhibited z as a sum of a vector from $\mathfrak{N}_1$ and a vector
from $\mathfrak{N}_2$. Conversely if z is such a sum then
$E_1E_2z = 0$; this concludes the proof that $\mathfrak{N} = \mathfrak{N}_1 + \mathfrak{N}_2$.





We shall return to theorems of this type later and
we shall obtain, in certain cases, more precise results.
Before leaving the subject, however, we call attention to
a few minor peculiarities of the theorem of this section.
We observe first that although in both (i) and (ii) one
of $\mathfrak{M}$ and $\mathfrak{N}$ was a direct sum of two of the given
linear manifolds, in (iii) we only stated $\mathfrak{N} = \mathfrak{N}_1 +
\mathfrak{N}_2$. Consideration of the possibility $E_1 = E_2 = E$
shows that this is unavoidable. Also: the condition of
(iii) was asserted to be sufficient only; it is possible
to construct projections $E_1$ and $E_2$ whose product
$E_1E_2$ is a projection, but for which $E_1E_2$ and $E_2E_1$
are different. Finally, it may be conjectured that it
is possible to extend, by induction, the result of (i)
to more than two summands. Although this is true it is
surprisingly non trivial; we shall prove it later in a
special case of interest.
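Two of the remarks above can be illustrated in the plane. The following sketch uses projections of our own choosing, written as 2 by 2 arrays: $E_1$ and $E_2$ both project on the first coordinate axis, along two different lines, while $P_1$ and $P_2$ project on the two coordinate axes. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square arrays of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    """Entrywise sum of two arrays."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_projection(M):
    """A transformation is a projection if and only if it is idempotent."""
    return matmul(M, M) == M

E1 = [[1, 0], [0, 0]]   # on the first axis along the second axis
E2 = [[1, 1], [0, 0]]   # on the first axis along the line spanned by (1, -1)

E1E2, E2E1 = matmul(E1, E2), matmul(E2, E1)

P1 = [[1, 0], [0, 0]]
P2 = [[0, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
```

Here $E_1E_2$ is a projection although $E_1E_2$ and $E_2E_1$ are different, and $E_1 + E_2$ is not a projection, in agreement with (i), since the products do not vanish. For $P_1$ and $P_2$ both products vanish and the sum is a projection (the identity).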

§31. APPLICATION TO REDUCIBILITY 
AND INVOLUTIONS 

We have already seen that the study of projections
is equivalent to the study of direct sum decompositions.
By means of projections we may also study the notions of
reducibility and complete reducibility.

THEOREM 1. If a linear manifold $\mathfrak{M}$ is
invariant under the linear transformation A
(i.e. A is reduced by $\mathfrak{M}$) then EAE = AE
for every projection E on $\mathfrak{M}$. Conversely,
if EAE = AE is valid for some projection E
on $\mathfrak{M}$ then $\mathfrak{M}$ reduces A.

PROOF. Suppose that A is reduced by M and that
V = M ⊕ N for some N; let E be the projection
on M along N. Then for any z = x + y (with x in
M, y in N) we have AEz = Ax, and EAEz = EAx;
this last term is, however, again equal to Ax since
x being in M guarantees the presence of Ax in M.
Conversely, suppose that V = M ⊕ N, and that
EAE = AE for the projection E on M along N. If
x is in M, then Ex = x, so that

EAx = EAEx = AEx = Ax,

and consequently Ax is also in M.
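
Theorem 1 is easy to test numerically. The sketch below is
only an illustration under our own assumptions: the matrices
A, A2 and E and the helper matmul are hypothetical examples,
not taken from the text. Here M is the set of pairs (s, 0)
and N the set of pairs (t, t).

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[2, 3], [0, 4]]    # sends (s, 0) to (2s, 0), so M is invariant under A
A2 = [[1, 0], [1, 1]]   # sends (1, 0) to (1, 1), so M is not invariant under A2
E = [[1, -1], [0, 0]]   # projection on M along N: (s, t) -> (s - t, 0)

AE = matmul(A, E)
EAE = matmul(E, AE)
EA = matmul(E, A)

print(EAE == AE)   # the criterion of Theorem 1 holds for A
print(EA == AE)    # A does not commute with E, since N is not invariant
print(matmul(E, matmul(A2, E)) == matmul(A2, E))  # the criterion fails for A2
```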

THEOREM 2. If M and N are two
linear manifolds with V = M ⊕ N, then
the linear transformation A is completely
reduced by the pair (M, N) if and only
if EA = AE, where E is the projection on
M along N.

PROOF. We first assume EA = AE and we prove that
A is completely reduced by (M, N). If x is in
M then Ax = AEx = EAx, so that Ax is also in M;
if x is in N then Ex = 0, and EAx = AEx = A0 = 0
so that Ax is also in N.

Finally we assume that A is completely reduced by
(M, N), and we prove that EA = AE. Since A is
reduced by M, Theorem 1 assures us that EAE = AE;
since A is also reduced by N, and since 1 - E is a
projection on N, we have (1-E)A(1-E) = A(1-E), whence
(after carrying out the indicated multiplications and
simplifying) EAE = EA. This concludes the proof of the
theorem. We observe that the first part of the theorem
could also have been deduced from Theorem 1 by formal
manipulation and it was only for the sake of variety that
we chose our method. The interested reader might carry
out the suggested proof.

We conclude, for the present, our discussion of 
projections with two isolated comments of some interest. 
(A) There is an amusing connection between the
idempotent operators just studied (i.e. those satisfying
the equation E^2 = E) and involutions (i.e. operators
U satisfying the equation U^2 = 1). Let us write

(1) U = 2E - 1,

(2) E = (1/2)(U + 1).

We assert: the formulae (1) and (2) establish a one to
one correspondence between all idempotents E and all
involutions U; if U corresponds to E by (1) then
E corresponds to U by (2). To prove this we must
show that for any idempotent E the U defined by (1)
is an involution, and, similarly, that for any involution
U the E defined by (2) is idempotent. These
verifications (which consist of squaring the right sides
of (1) and (2) and substituting E and 1 for E^2 and
U^2 respectively) we leave to the reader.
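
The correspondence between idempotents and involutions may
be tried out on a concrete matrix. The following Python
sketch is an illustration, not part of the text: the
function names to_involution and to_idempotent and the
sample matrix E are our own assumptions.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def to_involution(e):
    """Formula (1): U = 2E - 1."""
    n = len(e)
    return [[2 * e[i][j] - (1 if i == j else 0) for j in range(n)]
            for i in range(n)]


def to_idempotent(u):
    """Formula (2): E = (1/2)(U + 1), computed with exact fractions."""
    n = len(u)
    return [[Fraction(u[i][j] + (1 if i == j else 0), 2) for j in range(n)]
            for i in range(n)]


identity = [[1, 0], [0, 1]]
E = [[1, 1], [0, 0]]            # a sample idempotent: E * E == E
U = to_involution(E)

print(matmul(U, U) == identity)  # U is an involution
print(to_idempotent(U) == E)     # formula (2) undoes formula (1)
```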

(B) The matrices associated with projections in
finite dimensional spaces have, after proper choice of
coordinates, very simple forms. Let V be an n-dimensional
vector space, and let E be the projection on a
linear manifold M along N. We may choose a coordinate
system x_1, ..., x_n in V in such a way that
x_1, ..., x_m are in M and x_{m+1}, ..., x_n are in N.
In this coordinate system the matrix [E] = (e_ij) of E
will have the property that e_ij = 0 if i ≠ j,
and e_ii = 1 or 0 according as i ≤ m or i > m.
Query: what does this result imply about involutions?

§32. ADJOINT OPERATORS

Let us study next the relation between the notions
of linear transformation and conjugate space. Let V
be any vector space and let y be any element of V';
for any linear transformation A on V we consider the
expression [Ax,y]. For each fixed y, y' = y'(x) =
[Ax,y] is a linear functional defined on V; using the
square bracket notation for y' as well as for y we
have [Ax,y] = [x,y']. If now we allow y to vary over
V', then this procedure makes correspond to each y a
y' depending, of course, on y; we write y' = A'y. The
defining property of A' is

(1) [Ax,y] = [x,A'y].

We assert that A' is a linear transformation on V';
for if y = α_1 y_1 + α_2 y_2, then

[x,A'y] = [Ax,y] = α_1 [Ax,y_1] + α_2 [Ax,y_2]

= α_1 [x,A'y_1] + α_2 [x,A'y_2]

= [x, α_1 A'y_1 + α_2 A'y_2].

The linear transformation A' is called the adjoint of
A; we dedicate this section and the next to studying
properties of A'. Let us first get the formal algebraic
rules out of the way; they are the following.


(2)    0' = 0,

(3)    1' = 1,

(4)    (A + B)' = A' + B',

(5)    (αA)' = αA',

(6)    (AB)' = B'A',

(7)    (A^{-1})' = (A')^{-1}.


Here (7) is to be interpreted in the following sense:
if A has an inverse then A' also has an inverse and
(7) is valid. The proofs of all these relations are
elementary: to indicate the procedure we carry out the
computations for (6) and (7). (6) is proved by the
relations

[ABx,y] = [Bx,A'y] = [x,B'A'y].

To prove (7), suppose that A has an inverse; then
AA^{-1} = A^{-1}A = 1. Applying to this relation (3) and (6)
we obtain

(A^{-1})'A' = A'(A^{-1})' = 1;

Theorem 1 of §24 implies that (A')^{-1} exists and is equal
to (A^{-1})'.




In finite dimensional spaces (or, more properly
speaking, in reflexive spaces) another important relation
holds:

(8) A'' = A.

This relation has to be read with a grain of salt. As it
stands A'' is an operator not on V but on the conjugate
space V'' of V'. If, however, we identify V'' and
V according to the natural isomorphism, then A'' operates
on V and (8) makes sense. In this interpretation
the proof of (8) is trivial. Since V is reflexive
we obtain every linear functional on V' by considering
[x,y] as a function of y, with x fixed in V. Then
[x,A'y] is also a function (a linear functional) of y,
and may therefore be written in the form [x',y]. The
vector x' is, by definition, A''x. Hence we have, for
every y in V' and x in V,

(9) [Ax,y] = [x,A'y] = [A''x,y];

the equality of the first and last terms of (9) proves (8).

Under the hypothesis of (8) (i.e. reflexivity) the
asymmetry in the interpretation of (7) may be removed:
we assert that in this case the existence of (A')^{-1}
implies that of A^{-1} and, therefore, the validity of (7).
Proof: we may apply the old interpretation of (7) to A'
and A'' in place of A and A'.

Our discussion is summed up, in the reflexive finite
dimensional case, by the assertion that A → A' is a one
to one algebraically anti-isomorphic mapping of the set
of all linear operators on V onto the set of all linear
operators on V'. (The prefix "anti-" got attached
because of the commutation rule (6).)

§33. ADJOINT OF A PROJECTION

There is one important case in which multiplication
does not get turned around, i.e. when (AB)' = A'B';
namely, the case when A and B commute. In particular
we have (A^n)' = (A')^n and, more generally, for any
polynomial p(t), (p(A))' = p(A'). It follows from this
that if E is a projection then so is E'. The question
arises: what direct sum decomposition is E' associated
with?

THEOREM 1. If E is the projection on
M along N, then E' is the projection
on N^0 along M^0.

PROOF. That (E')^2 = E' and that V' = N^0 ⊕ M^0,
we have already seen. (Cf. §19.) It is necessary only
to find the linear manifolds of solutions of E'y = 0
and E'y = y. This we do in four steps.

(i) If y is in M^0 then for all x

[x,E'y] = [Ex,y] = 0

so that E'y = 0.

(ii) If E'y = 0 then for all x in M

[x,y] = [Ex,y] = [x,E'y] = 0

so that y is in M^0.

(iii) If y is in N^0 then for all x

[x,y] = [Ex,y] + [(1-E)x,y] = [Ex,y] = [x,E'y]

so that E'y = y.

(iv) If E'y = y then for all x in N

[x,y] = [x,E'y] = [Ex,y] = 0

so that y is in N^0.

Steps (i) and (ii) together show that the set of
solutions of E'y = 0 is precisely M^0; steps (iii) and
(iv) together show that the set of solutions of E'y = y
is precisely N^0. This concludes the proof of the
theorem.

THEOREM 2. If M reduces A then M^0
reduces A'; if A is completely reduced by
(M, N) then A' is completely reduced
by (M^0, N^0).

PROOF. We shall prove only the first statement; the
second one clearly follows from it. We first observe the
following identity, valid for any three operators E, F,
and A, subject to the relation F = 1 - E:

(1) FAF - FA = EAE - AE.

(Compare this with the proof of Theorem 2, §31.) Let E
be any projection on M; by Theorem 1, §31, the right
member of (1) vanishes, and so, therefore, does the left
member. By taking adjoints we obtain F'A'F' = A'F';
since by Theorem 1 of the present section F' = 1 - E' is a
projection on M^0, the proof of Theorem 2 is complete.

We conclude by discussing the matrices of adjoint
operators; this discussion is meant to illuminate the
entire theory and to enable the reader to construct many
examples.

We shall need the following fact: if X = (x_1, ...,
x_n) is any basis in the n-dimensional vector space V,
and if X' = (y_1, ..., y_n) is the dual basis in V',
and if the matrix of the linear transformation A in the
coordinate system X is (α_ij), then

(2) α_ij = [Ax_j, y_i].

This follows from the definition of matrix: since
Ax_j = Σ_k α_kj x_k, we have

[Ax_j, y_i] = Σ_k α_kj [x_k, y_i] = α_ij.

To keep things straight in the applications we rephrase
formula (2) verbally, thus: to find the (i,j) element of
[A] in the basis X apply A to the j-th element of X
and then take the value of the i-th linear functional
(in X') at the vector so obtained.

It is now very easy to find the matrix (α'_ij) = [A']
in the coordinate system X'; we merely follow the
recipe just given. In other words we consider A'y_j, and
then take the value of the i-th linear functional in
X'' (i.e. of x_i considered as a linear functional on
V') at this vector; the result is that

α'_ij = [x_i, A'y_j].

Since [x_i, A'y_j] = [Ax_i, y_j] = α_ji, so that α'_ij = α_ji,
this matrix [A'] is called the transpose of [A].

Observe that our results on the relation between E
and E' (where E is a projection) could also have been
derived by using the known facts about the matricial
representation of a projection together with the present
result on the matrices of adjoint operators.
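
The reader who wishes to experiment may check the rules for
adjoints on transposed matrices. The Python sketch below is
an illustration under our own assumptions: vectors and
functionals are represented by coordinate lists, the bracket
[x,y] by the sum of the products of coordinates, and the
sample matrices and helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    """Matrix of the adjoint in the dual basis."""
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    """Apply the matrix m to the coordinate column v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def bracket(x, y):
    """[x, y] in coordinates with respect to a basis and its dual basis."""
    return sum(a * b for a, b in zip(x, y))


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
x = [7, -1]
y = [2, 3]

# Defining property (1) of the adjoint: [Ax, y] = [x, A'y].
print(bracket(matvec(A, x), y) == bracket(x, matvec(transpose(A), y)))
# Rule (6): (AB)' = B'A'.
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))
```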

§34. CHANGE OF BASIS

Although what we have been doing in the preceding
sections of this chapter may have been complicated it was
to a large extent automatic: having introduced the new
concept of linear transformation, we merely let all the
preceding concepts suggest ways in which they are connected
with linear transformations. We now begin the
proper study of the theory of linear transformations. As
a first application of this theory we shall solve the
problems arising from a change of basis. These problems
can be formulated without mentioning linear transformations,
but their solution is most effectively given in
terms of linear transformations.

Let V be an n-dimensional vector space and let
X = (x_1, ..., x_n) and Y = (y_1, ..., y_n) be two
bases in V. We may ask the following two questions.

QUESTION I. Given a vector x in V,
x = Σ_i ξ_i x_i = Σ_i η_i y_i, what is the relation
between its coordinates (ξ_1, ..., ξ_n) with
respect to X and its coordinates
(η_1, ..., η_n) with respect to Y?

QUESTION II. Given an ordered set of
n scalars, (ξ_1, ..., ξ_n), what is the relation
between the vectors x = Σ_i ξ_i x_i and
y = Σ_i ξ_i y_i?

Both these questions are easily answered in the
language of linear transformations. We consider, namely,
the linear transformation A defined by y_i = Ax_i,
i = 1, ..., n. More explicitly:

(1) A(Σ_i ξ_i x_i) = Σ_i ξ_i y_i.

Let (α_ij) be the matrix of A in the basis X, i.e.
y_j = Ax_j = Σ_i α_ij x_i. We observe that A has an inverse,
since Σ_i ξ_i y_i = 0 implies that ξ_1 = ... = ξ_n = 0.

Answer to question I. Since

Σ_j η_j y_j = Σ_j η_j Ax_j = Σ_j η_j Σ_i α_ij x_i

= Σ_i (Σ_j α_ij η_j) x_i,

we have

(2) ξ_i = Σ_j α_ij η_j.

Answer to question II.

(3) y = Ax.

Roughly speaking: the non singular linear transformation
A (or, more properly, the matrix (α_ij)) may
be considered as a transformation of coordinates (as in
(2)), or it may be considered (as we usually consider it,
in (3)) as a transformation of vectors.
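
Formula (2) can be verified on a small numerical case. In
the Python sketch below everything concrete is our own
assumption: X is taken to be the standard basis of the plane,
the new basis Y is read off from the columns of a sample
matrix A, and the helper matvec is hypothetical.

```python
def matvec(m, v):
    """Apply the matrix m to the coordinate column v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


# Column j of A holds the X-coordinates of y_j:
# y_1 = x_1 + x_2 and y_2 = 2 x_1 + 3 x_2.
A = [[1, 2],
     [1, 3]]

eta = [3, -1]          # coordinates of a vector with respect to Y
xi = matvec(A, eta)    # formula (2): its coordinates with respect to X

# Direct check: expand 3 y_1 - y_2 in terms of x_1 and x_2.
y_vectors = [[A[i][j] for i in range(2)] for j in range(2)]
direct = [sum(eta[j] * y_vectors[j][i] for j in range(2)) for i in range(2)]

print(xi, direct)      # the two computations agree
```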

In classical treatises on vector spaces it is customary
to treat vectors as numerical n-tuples rather than
as abstract entities; this leads to the necessity of
introducing some cumbersome terminology. We give here a
brief glossary of some of the more baffling terms and
notations that arise in connection with conjugate spaces
and adjoint transformations.

If V is an n-dimensional vector space a vector x
is given by its coordinates with respect to some preferred,
absolute, coordinate system; these coordinates
form an ordered set of n numbers. It is customary to
write this set of n numbers in a column,

        ( ξ_1 )
    x = (  .  )
        (  .  )
        ( ξ_n )

Elements of the conjugate space V' are written as rows,
x' = (ξ'_1, ..., ξ'_n). If we think of x as a (rectangular)
n by one matrix, and of x' as a one by n matrix,
then the matrix product x'x is a one by one matrix,
i.e. a scalar. In our notation this number is [x,x'] =
ξ_1 ξ'_1 + ... + ξ_n ξ'_n. The trick of considering vectors
as thin matrices works even when we consider the full
grown matrices of linear transformations. Thus the
matrix product of (α_ij) with the column (ξ_j) is the
column (Σ_j α_ij ξ_j). Instead of worrying about dual
bases and adjoint transformations we may form similarly
the product of the row (ξ'_i) with the matrix (α_ij)
in the order (ξ'_i)(α_ij); the result is the row which we
earlier denoted by y' = A'x'. The form [Ax,x'] is now
abbreviated as x'·A·x; both dots denote ordinary matrix
multiplication. The vectors x in V are called covariant
and the vectors x' in V' are called contravariant.
Since the notion of the product x'x (i.e.
[x,x']) depends, in this point of view, on the coordinates
of x and x' it becomes relevant to ask the following
question: if we change basis in V in accordance with
the non singular linear transformation A, what must we
do in V' to preserve the product x'x? In our notation:
if [x,x'] = [y,y'] where y = Ax, then how is y' related
to x'? Answer: y' = (A')^{-1}x'. To express this
whole tangle of ideas the classical terminology says that
the vectors x vary cogrediently whereas the x' vary
contragrediently.
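
The answer y' = (A')^{-1}x' is easily tested in coordinates.
The following Python sketch is an illustration only; the
sample matrix A, the vectors x and xp (standing for x'), and
the helpers inverse2, transpose, matvec and bracket are our
own assumptions, and the matrix of A' is taken to be the
transpose of [A] as in §33.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def bracket(x, xp):
    """[x, x'] as the sum of products of coordinates."""
    return sum(a * b for a, b in zip(x, xp))


def inverse2(m):
    """Inverse of a non singular 2 by 2 matrix, with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 2], [1, 3]]    # a sample non singular transformation
x = [3, 4]              # a vector, written as a column
xp = [5, -2]            # a linear functional x', written as a row

y = matvec(A, x)                         # x varies cogrediently
yp = matvec(inverse2(transpose(A)), xp)  # x' varies contragrediently

print(bracket(x, xp), bracket(y, yp))    # the product is preserved
```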

§35. LINEAR TRANSFORMATIONS UNDER A CHANGE OF BASIS

Two questions closely related to those of the preceding
section are the following.

QUESTION III. Given a linear transformation
B on V, what is the relation
between its matrix (β_ij) with respect to
X and its matrix (γ_ij) with respect to Y?

QUESTION IV. Given a matrix (β_ij),
what is the relation between the linear
transformations B and C defined, respectively,
by Bx_j = Σ_i β_ij x_i and Cy_j = Σ_i β_ij y_i?

Questions III and IV are explicit formulations of a
problem we raised before: to one transformation there
correspond (in different coordinate systems) many
matrices (question III) and to one matrix there correspond
many transformations (question IV).

Answer to question III. We have

(1) Bx_j = Σ_i β_ij x_i,

(2) By_j = Σ_i γ_ij y_i.

Using the linear transformation A defined in the preceding
section we may write

(3) By_j = BAx_j = B(Σ_k α_kj x_k) = Σ_k α_kj Bx_k

= Σ_k α_kj Σ_i β_ik x_i = Σ_i (Σ_k β_ik α_kj) x_i,

and

(4) Σ_k γ_kj y_k = Σ_k γ_kj Ax_k

= Σ_k γ_kj Σ_i α_ik x_i = Σ_i (Σ_k α_ik γ_kj) x_i.

Comparing (2), (3), and (4) we see that

(5) Σ_k α_ik γ_kj = Σ_k β_ik α_kj.

Using matrix multiplication we write this in the dangerously
simple form

(6) [A][C] = [B][A].

The danger lies in the fact that three of the four
matrices written correspond to their operators in the
basis X; the fourth one, namely the one we denoted by
[C], corresponds to B in the basis Y. With this
understanding, however, (6) is correct: a more usual
form of it, adapted to computing [C] when [A] and
[B] are known, is

(7) [C] = [A]^{-1}[B][A].
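
A numerical instance may make the "dangerously simple"
formulae (6) and (7) less dangerous. The Python sketch
below is an illustration under our own assumptions: X is the
standard basis of the plane, the matrices A and B are sample
choices, and matmul and inverse2 are hypothetical helpers.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse2(m):
    """Inverse of a non singular 2 by 2 matrix, with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 2], [1, 3]]   # columns are the X-coordinates of y_1 and y_2
B = [[2, 0], [1, 3]]   # matrix of the transformation B in the basis X

# Formula (7): the matrix of the same transformation B in the basis Y.
C = matmul(inverse2(A), matmul(B, A))

# Formula (6): column j of [B][A] is B y_j in X-coordinates, and column j
# of [A][C] is the combination of y_1, y_2 with the coefficients of column j of C.
print(C)
print(matmul(A, C) == matmul(B, A))
```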

Answer to question IV. To bring out the essentially
geometric character of this question and its answer we
observe that

(8) Cy_j = CAx_j

and

(9) Σ_i β_ij y_i = Σ_i β_ij Ax_i = A(Σ_i β_ij x_i) = ABx_j.

Hence C is such that

CAx_j = ABx_j,

or

(10) CA = AB,

or, finally,

(11) C = ABA^{-1}.

There is no trouble with (11) similar to the reservation
we had to make about the interpretation of (7): to find
the operator (not matrix) C we multiply the operators
A, B, and A^{-1}, and nothing needs to be said about
coordinate systems. Compare, however, the formulae (7) and
(11) and observe once more the innate perversity of
mathematical symbols. This is merely another aspect of the
fact expressed by the formulae (1), §25, and (7), §29.

There are still too many subscripts in the answer
to question IV. The validity of (11) is a geometric fact
quite independent of linearity, finite dimensionality, or
any other property that A, B, and C may possess: the
answer to question IV is also the answer to a much more
general question. This geometric question, a paraphrase
of the analytic formulation of question IV, is this: "If
B transforms V, and if C transforms AV the same
way, what is the relation between B and C?" The expression
"the same way" is not as vague as it sounds; it
means that if B takes x into, say, u, then C takes
Ax into Au. The answer is, of course, the same as before:
since Bx = u and Cy = v (where y = Ax and
v = Au), we have

ABx = Au = v = Cy = CAx.

The following is a convenient mnemonic diagram:

        x ---B---> u
        |          |
        A          A
        ↓          ↓
        y ---C---> v

We may go from y to v by using the short cut C, or
by going around the block: in other words C = ABA^{-1}.
Remember that ABA^{-1} is to be applied to y from right
to left: first A^{-1}, then B, then A.

We have seen that the theory of changing bases is
coextensive with the theory of non singular linear
transformations. A non singular linear transformation is an
automorphism, where we mean by an automorphism an
isomorphism of a vector space with itself. (See §8.) We
observe that conversely every automorphism is a non
singular linear transformation.






We hope that the relation between linear transformations
and matrices is by now sufficiently clear that
the reader will not object if in the sequel, when we wish
to give examples of linear transformations with various
properties, we content ourselves with writing down a
matrix. The interpretation always to be placed on this
procedure is that we have in mind the concrete vector
space C^n and the concrete basis (x_1, ..., x_n) defined
by x_i = (δ_i1, ..., δ_in). With this understanding
a matrix (α_ij) defines, of course, a unique linear
transformation A, given by the usual formula
A(Σ_i ξ_i x_i) = Σ_i (Σ_j α_ij ξ_j) x_i.

§36. RANGE AND NULL SPACE OF A LINEAR TRANSFORMATION

DEFINITION. If A is a linear transformation
on a vector space V and M is
any linear manifold in V, we denote by
AM the set of all vectors of the form Ax
with x in M. The range of A is the
set R(A) = AV; the null space of A
is the set N(A) of all vectors x for
which Ax = 0.

It is immediately verified that AM and N(A)
are linear manifolds. If we denote, as usual, by O
the linear manifold containing only the vector 0, it is
easy to describe some familiar concepts in terms of the
terminology just introduced; we list some of the results.

(i) A has an inverse if and only if R(A) = V
and N(A) = O.

(ii) In case V is finite dimensional, A has an
inverse if and only if N(A) = O.

(iii) A is reduced by the linear manifold M if
and only if AM ⊂ M.

(iv) A is completely reduced by the direct sum
decomposition V = M ⊕ N if and only if AM ⊂ M
and AN ⊂ N.

(v) If E is the projection on M along N, then
R(E) = M and N(E) = N.

All these statements are easy to prove: we indicate
the proof of (v). From §29 we know that N
is the set of all solutions of the equation Ex = 0:
this coincides with our definition of N(E). We also
know that M is the set of all solutions of the equation
Ex = x. If x is in M then x is also in R(E)
since x is the E of something, namely of x itself.
Conversely if a vector x is the E of something, say
x = Ey (so that x is in R(E)), then Ex = E^2 y = Ey =
x, so that x is in M.

Warning: it is accidental that for projections
R ⊕ N = V. In general it need not even be true
that R = R(A) and N = N(A) are disjoint. It can
happen, for example, that for a certain vector x we
have x ≠ 0, Ax ≠ 0, and A^2 x = 0; for such an x, Ax
clearly belongs to both the range and the null space of A.
(Concrete example: A = differentiation on P_n, n > 1,
x = x(t) = t.)
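
The concrete example of the warning may be carried out
explicitly. In the Python sketch below, which is an
illustration only, a polynomial of degree at most 2 is
represented by its list of coefficients (an assumption of
ours, as is the helper differentiate).

```python
def differentiate(coeffs):
    """Derivative of c0 + c1 t + ... + c_{n-1} t^{n-1}, kept at length n."""
    return [k * coeffs[k] for k in range(1, len(coeffs))] + [0]


x = [0, 1, 0]            # the polynomial x(t) = t
Ax = differentiate(x)    # the constant polynomial 1
AAx = differentiate(Ax)  # the zero polynomial

# Ax is in the range of A (it is A of x) and in the null space of A
# (A applied to it gives 0), although Ax itself is not 0.
print(Ax, AAx)
```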

THEOREM. If A is a linear transformation
on a vector space V then

(R(A))^0 = N(A');

if V is finite dimensional then

(N(A))^0 = R(A').

PROOF. If y is in (R(A))^0 then for all x
in V

0 = [Ax,y] = [x,A'y]

so that A'y = 0 and y is in N(A'). If, on the other
hand, y is in N(A') then for every x in V

0 = [x,A'y] = [Ax,y]

so that y is in (R(A))^0.

We may apply this result to A' in place of A and
we obtain

(1) (R(A'))^0 = N(A'').

If V is finite dimensional (and hence reflexive) we may
replace, in (1), A'' by A, and then we may attach the
superscript 0 to both sides of (1), obtaining (Theorem
2, §16)

R(A') = (N(A))^0.

§37. RANK AND NULLITY

We shall now restrict attention to the finite dimensional
case and draw certain easy conclusions from the
theorem of the preceding section.

DEFINITION. The rank, ρ(A), of a
linear transformation A on a finite dimensional
vector space V is the dimension of
R(A); the nullity, ν(A), is the dimension
of N(A).

THEOREM 1. If A is a linear transformation
on an n dimensional vector space
V, then ρ(A) = ρ(A') and ν(A) =
n - ρ(A).

PROOF. The theorem of the preceding section and
Theorem 1 of §16 together imply that

(1) ν(A') = n - ρ(A).

Let X = (x_1, ..., x_n) be any basis in V for which
x_1, ..., x_ν are in N(A) (where ν = ν(A)); then for any
x = Σ_i ξ_i x_i we have

Ax = Σ_{i=1}^{n} ξ_i Ax_i = Σ_{i=ν+1}^{n} ξ_i Ax_i.

In other words Ax is a linear combination of the n - ν
vectors Ax_{ν+1}, ..., Ax_n; it follows that ρ(A) ≤ n -
ν(A). Applying this result to A' and using (1) we
obtain

(2) ρ(A') ≤ n - ν(A') = ρ(A).

In (2) we may replace A by A', obtaining

(3) ρ(A) = ρ(A'') ≤ ρ(A');

(2) and (3) together show that

(4) ρ(A) = ρ(A'),

and (1) and (4) together show that

(5) ν(A') = n - ρ(A').

Replacing A by A' in (5) gives, finally,

(6) ν(A) = n - ρ(A),

and concludes the proof of the theorem.

These results are usually discussed from a little
different point of view. Let A be a linear transformation
on, and X = (x_1, ..., x_n) a basis in, the n-dimensional
vector space V; let [A] = (α_ij) be the
matrix of A in the coordinate system X, so that
Ax_j = Σ_i α_ij x_i. Since if x = Σ_j ξ_j x_j then
Ax = Σ_j ξ_j Ax_j, every vector in
R(A) is a linear combination of the Ax_j, and hence of
any maximal linearly independent subset of the Ax_j. It
follows that the maximal number of linearly independent
Ax_j is precisely ρ(A). In terms of the coordinates
(α_1j, α_2j, ..., α_nj) of Ax_j we may express this by
saying that ρ(A) is the maximal number of linearly
independent columns of the matrix [A]. Since (§33) the
columns of [A'] (the matrix being expressed in terms of
the dual basis of X) are the rows of [A], it follows
from Theorem 1 that ρ(A) is also the maximal number of
linearly independent rows of [A]. Hence "the row
rank of [A] = the column rank of [A] = the rank of A."
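
The equality of row rank and column rank, and the relation
ν(A) = n - ρ(A), can be checked by elimination. The Python
sketch below is an illustration only: the function rank
(Gaussian elimination over exact fractions) and the sample
matrix are our own assumptions, not a procedure taken from
the text.

```python
from fractions import Fraction


def rank(m):
    """Rank of a matrix (list of rows), by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


A = [[1, 2, 3],
     [2, 4, 6],   # twice the first row
     [1, 0, 1]]

print(rank(A), rank(transpose(A)))   # row rank and column rank agree
print(len(A) - rank(A))              # the nullity n - rank
print(matvec(A, [1, 1, -1]))         # a non zero vector of the null space
```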






THEOREM 2. If A is a linear transformation
on the n-dimensional vector space
V, and if H is any h-dimensional linear
manifold in V, then the dimension of AH
is ≥ h - ν(A).

PROOF. Let K be any linear manifold for which
V = H ⊕ K, so that for the dimension k of K we
have k = n - h. Upon operating with A we obtain

AV = AH + AK.

(The sum is not necessarily a direct sum; see §10.)
Since AV = R(A) has dimension n - ν(A), since the
dimension of AK is clearly ≤ k = n - h, and since the
dimension of the sum is ≤ the sum of the dimensions
(proof?), we have the desired result.

THEOREM 3. If A and B are linear
transformations (on a finite dimensional vector
space) and if B is non singular then

(7) ρ(AB) = ρ(BA) = ρ(A).

In any case

(8) ρ(A + B) ≤ ρ(A) + ρ(B),

and

(9) ρ(AB) ≤ min {ρ(A), ρ(B)},

and

(10) ν(AB) ≤ ν(A) + ν(B).

PROOF. Since (AB)x = A(Bx), R(AB) is contained
in R(A), so that ρ(AB) ≤ ρ(A), or, in other words,
the rank of a product is not greater than the rank of the
first factor. Let us apply this auxiliary result to
B'A'; this, together with what we already know, yields
(9). If B has an inverse then

ρ(A) = ρ(AB·B^{-1}) ≤ ρ(AB)

and

ρ(A) = ρ(B^{-1}·BA) ≤ ρ(BA);

together with (9) this yields (7). (8) is an immediate
consequence of an argument we have already used in the
proof of Theorem 2. The proof of (10) we leave as an
exercise for the reader. (Hint: apply Theorem 2 with
H = BV = R(B).) Together the two formulae (9)
and (10) are known as Sylvester's law of nullity.
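
The relations (7) to (10) may be observed on small matrices.
The following Python sketch is an illustration only; the
function rank (elimination over exact fractions), the
helpers matmul and nullity, and the sample matrices A, B and
S are all assumptions of ours.

```python
from fractions import Fraction


def rank(m):
    """Rank of a matrix (list of rows), by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nullity(m):
    """Nullity of a square matrix: n minus the rank."""
    return len(m) - rank(m)


A = [[1, 0], [0, 0]]
B = [[0, 0], [0, 1]]
S = [[1, 1], [0, 1]]   # non singular
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
AB = matmul(A, B)

print(rank(matmul(A, S)), rank(matmul(S, A)), rank(A))   # relation (7)
print(rank(A_plus_B) <= rank(A) + rank(B))               # relation (8)
print(rank(AB) <= min(rank(A), rank(B)))                 # relation (9)
print(nullity(AB) <= nullity(A) + nullity(B))            # relation (10)
```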

§38. LINEAR TRANSFORMATIONS OF RANK ONE

We conclude our discussion of rank by a description
of the matrices of transformations of rank ≤ 1.

THEOREM 1. If for a linear transformation
A on a finite dimensional vector
space V, ρ(A) ≤ 1 (i.e. ρ(A) = 0 or
ρ(A) = 1), then the matrix [A] = (α_ij)
of A has the form α_ij = β_i γ_j in every
coordinate system; conversely if the matrix
of A has this form in some one coordinate
system then ρ(A) ≤ 1.

PROOF. If ρ(A) = 0, A = 0, and the statement is
trivial. If ρ(A) = 1, i.e. R(A) is one dimensional,
then there exists in R(A) a non zero vector x_0 (a
basis in R(A)) such that every vector in R(A) is a
multiple of x_0. Hence for every x

Ax = y_0 x_0,

where the scalar coefficient y_0 depends, of course, on
x; y_0 = y_0(x). The linearity of A implies that y_0 is
a linear functional on V. Let now X = (x_1, ..., x_n)
be a basis in V, and let (α_ij) be the corresponding
matrix of A, so that

Ax_j = Σ_i α_ij x_i.

Let X' = (y_1, ..., y_n) be the dual basis in V'; then
(see formula (2), §33)

α_ij = [Ax_j, y_i].

In our case

α_ij = [y_0(x_j) x_0, y_i] = [x_0, y_i][x_j, y_0].

In other words we may take β_i = [x_0, y_i] and γ_j =
[x_j, y_0].

Conversely suppose that in the fixed coordinate
system X = (x_1, ..., x_n) the matrix (α_ij) of A
is such that α_ij = β_i γ_j. We find a linear functional
y_0 = y_0(x) for which γ_j = [x_j, y_0], and we may
define the vector x_0 = Σ_i β_i x_i. The linear transformation
Ã defined by Ãx = y_0(x)·x_0 is clearly of rank
one (unless, of course, α_ij = 0 for all i and j),
and its matrix (α̃_ij) in the coordinate system X is
given by

α̃_ij = [Ãx_j, y_i]

(where X' = (y_1, ..., y_n) is the dual basis of X).
Hence α̃_ij = [y_0(x_j)x_0, y_i] = [x_0, y_i][x_j, y_0] = β_i γ_j, and
since A and Ã have the same matrices in one coordinate
system Ã = A. This concludes the proof of the theorem.
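
A matrix of the form α_ij = β_i γ_j is easily produced and
examined. The Python sketch below is an illustration under
our own assumptions: the scalars beta and gamma, the test
vector x, and the helpers are hypothetical. Rank at most one
is tested by the vanishing of all two by two minors, which is
a standard criterion and not the argument of the text.

```python
def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


beta = [1, 2, 3]     # coordinates of the vector x_0
gamma = [4, 5, 6]    # values of the functional y_0 on the basis vectors

# The matrix with entries beta_i * gamma_j.
A = [[b * g for g in gamma] for b in beta]


def all_minors_vanish(m):
    """True when every 2 by 2 minor is zero, i.e. the rank is at most one."""
    n = len(m)
    return all(m[i][j] * m[k][l] == m[i][l] * m[k][j]
               for i in range(n) for k in range(i + 1, n)
               for j in range(n) for l in range(j + 1, n))


x = [1, 0, -1]
y0_of_x = sum(g * xi for g, xi in zip(gamma, x))   # y_0(x)

print(all_minors_vanish(A))                        # rank at most one
print(matvec(A, x) == [y0_of_x * b for b in beta]) # Ax = y_0(x) x_0
```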

The following theorem sometimes makes it possible
to apply Theorem 1 to obtain results about an arbitrary
linear transformation.

THEOREM 2. If A is a linear transformation,
of rank ρ, on a finite dimensional
vector space V, then A may be written
as the sum of ρ transformations of rank one.

PROOF. Since AV = R(A) has dimension ρ we
may find ρ vectors x_1, ..., x_ρ forming a basis for
R(A), so that for every vector x in V we have

Ax = Σ_{i=1}^{ρ} η_i x_i,

where η_i depends, of course, on x; we write η_i =
y_i(x). It is easy to see that y_i is a linear functional,
y_i(x) = [x, y_i]. In terms of these y_i we define for
each i = 1, ..., ρ a linear transformation A_i by A_i x =
y_i(x)x_i. Then each A_i has rank one and A = Σ_i A_i.

(Compare this result with (2), §20.) A slight refinement
of the proof just given yields the following result.

THEOREM 3. Corresponding to any linear
transformation A on a finite dimensional
vector space V there is a non singular
linear transformation A_1 for which A_1 A
is a projection.

PROOF. Let R and N respectively be the range
and the null space of A, and let x_1, ..., x_ρ be a
basis for R. Let x_{ρ+1}, ..., x_n be such that
x_1, ..., x_n is a basis for V. Since for i = 1, ...,
ρ, x_i is in R we may find vectors y_i such that
Ay_i = x_i; finally we choose a basis, which we may denote
by y_{ρ+1}, ..., y_n, in N. We assert that y_1, ..., y_n
is a basis for V. We need of course prove only that
the y's are linearly independent. For this purpose we
suppose that Σ_{i=1}^{n} α_i y_i = 0; then we have (remembering
that for i = ρ+1, ..., n, y_i is in N)

A(Σ_{i=1}^{n} α_i y_i) = Σ_{i=1}^{ρ} α_i x_i = 0,

whence α_1 = ... = α_ρ = 0. Consequently Σ_{i=ρ+1}^{n} α_i y_i =
0; the linear independence of y_{ρ+1}, ..., y_n shows that
the remaining α's must also vanish.

A linear transformation A_1, of the kind whose
existence we asserted, is now determined by the conditions
A_1 x_i = y_i, i = 1, ..., n. (For i = 1, ..., ρ,
A_1 A y_i = A_1 x_i = y_i, and for i = ρ+1, ..., n, A_1 A y_i =
A_1 0 = 0.)






Consideration of the adjoint of A, together with
the reflexivity of V, shows that we may also find a
non singular A_2 for which AA_2 is a projection. In
case A itself is non singular we must have A_1 = A_2 =
A^{-1}.

Using the results of §35 the reader may readily
verify the following matricial consequence of Theorem 3:
to any matrix [A] there correspond two non singular
matrices [P] and [Q] such that [P][A][Q] is a
diagonal matrix whose diagonal elements are all one or
zero.


§39. DETERMINANTS AND THE SPECTRAL TERMINOLOGY

It becomes necessary at this stage to go counter
to the principle of giving all definitions invariantly
(i.e. without using bases): we wish to say a word about
determinants. At the same time we shall find it convenient
to decide for once and for all what set of scalars
we are going to use, and accordingly we announce that
from here on throughout the remainder of this book, unless
we explicitly state otherwise, we shall restrict
our attention to complex vector spaces. The only special
property of the field of complex numbers that we shall
use in the present chapter is its algebraic closure:
every polynomial equation with complex coefficients has
a complex root, and, consequently, the number of its
roots, counting multiplicities as usual, is exactly
equal to the degree of the polynomial.

Let A be a linear transformation on an n-dimensional
complex vector space V, and let X = (x_1, ...,
x_n) be any basis in V. We write Δ_X(A) for the
determinant of the matrix [A; X]. We assume here a
knowledge of the elementary properties of determinants;
more explicitly we shall assume that the reader is aware
of the following simple properties:




(1)    Δ_X(0) = 0,

(2)    Δ_X(1) = 1,

(3)    Δ_X(AB) = Δ_X(A) Δ_X(B),

(4)    Δ_X(A) = Δ_{X'}(A'),

(5)    A is singular if and only if Δ_X(A) = 0,

(6)    Δ_X(A - λ1) is a polynomial of degree n in λ;
       the coefficient of λ^n is (-1)^n.

These properties are not logically independent of
each other and are not presented in the spirit of an
axiomatic approach to the study of determinants: they
are merely properties that we shall use. It is true,
however, that if an axiomatic theory of determinants did
exist, it would start with a similar list of elementary
properties and then prove that there is one and only one
function Δ_X satisfying them. We unfortunately are not
able to do this without using coordinate systems, and
once we resign ourselves to their use the usual explicit
combinatorial definition of the determinant of a matrix
is completely satisfactory for our purposes.

The most important thing to observe about Δ_X is
that it is independent of X. If, in other words, X_1
and X_2 are any two bases in V then Δ_{X_1}(B) =
Δ_{X_2}(B) for all B. For (2) and (3) imply that if a
linear transformation A has an inverse then Δ_X(A^{-1})
= 1/Δ_X(A); the formula (7), §35, on change of basis,
shows that

Δ_{X_2}(B) = Δ_{X_1}(A^{-1}BA)

for a suitable non singular A. It follows from the
multiplicative property (3) that Δ_{X_1}(B) = Δ_{X_2}(B).
In view of this fact we shall in the future omit the
subscript and write Δ(A) for the determinant of the
matrix of A (with respect to an arbitrary coordinate
system).
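
The independence of the determinant from the coordinate
system may be checked on two by two matrices. The Python
sketch below is an illustration only; the helpers det2,
inverse2 and matmul and the sample matrices A and B are our
own assumptions.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det2(m):
    """Determinant of a 2 by 2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Inverse of a non singular 2 by 2 matrix, with exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 2], [3, 4]]   # a sample non singular change of basis
B = [[2, 0], [1, 3]]   # matrix of a transformation in the first basis

A_inv = inverse2(A)
C = matmul(A_inv, matmul(B, A))   # its matrix in the second basis, by (7), §35

print(det2(A_inv) * det2(A))      # property (2) with (3): the product is 1
print(det2(B), det2(C))           # the same determinant in both bases
```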





DEFINITION. If A is any linear
transformation on a finite dimensional vector
space, the characteristic polynomial of
A is the polynomial Δ(A - λ1) in λ, and
the equation Δ(A - λ1) = 0 is the characteristic
equation of A. A scalar λ is a proper
value, and a vector x ≠ 0 a proper vector,
of A, if Ax = λx.

Almost any combination of the adjectives proper,
latent, characteristic, eigen, secular, with the nouns
root, number, value, has been used in the literature
for what we call a proper value. It is important to
realize the order of choice in the definition: $\lambda$ is a
proper value of A if there exists a vector $x \neq 0$ for
which $Ax = \lambda x$, and $x \neq 0$ is a proper vector of A
if there exists a $\lambda$ for which $Ax = \lambda x$. Since
$Ax = \lambda x$ with an $x \neq 0$ is equivalent to $(A - \lambda 1)x = 0$,
and since the existence of such an x is in turn equivalent,
by property (5), to $\Delta(A - \lambda 1) = 0$,
we see that $\lambda$ is a proper value of A if and only if
it is a root of the characteristic equation of A, and
that, therefore, every A has exactly n proper values
(counting multiplicities). The multiplicity of the root
$\lambda$ of the characteristic equation is also called the
multiplicity of the proper value $\lambda$; in particular $\lambda$ is
a simple proper value if it is a simple root of the
characteristic equation. The set of n proper values of A,
with multiplicities properly counted, is the spectrum
of A. In this language: $\lambda = 0$ is a proper value of
multiplicity n of the linear transformation 0; $\lambda = 1$
is a proper value of multiplicity n of the linear
transformation 1; the proper values of A, together with
their multiplicities, are exactly the same as those of
A'. We observe that if B is any non singular
transformation then

$\Delta(BAB^{-1} - \lambda 1) = \Delta(B(A - \lambda 1)B^{-1}) = \Delta(A - \lambda 1)$,





so that the characteristic equation, and consequently
every other spectral concept such as the proper values
and their multiplicities, is invariant under replacing
A by $BAB^{-1}$.

We note also that if $Ax = \lambda x$ then

$A^2 x = A(Ax) = A(\lambda x) = \lambda(Ax) = \lambda(\lambda x) = \lambda^2 x$;

more generally for any polynomial p(t), $p(A)x = p(\lambda)x$,
so that every proper vector x of A, belonging to the
proper value $\lambda$, is also a proper vector of p(A)
belonging to the proper value $p(\lambda)$. Hence if A
satisfies any equation of the form p(A) = 0 then for every
proper value $\lambda$ of A, $p(\lambda) = 0$. Query: what can be
said about the proper values of a projection and of an
involution?
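The relation $p(A)x = p(\lambda)x$ is easy to test on a small example.
The sketch below is our own illustration (the names matvec and
poly_apply, the matrix, and the polynomials are assumptions, not taken
from the text); it evaluates p(A)x by Horner's rule applied to vectors.

```python
def matvec(a, x):
    """Apply the square matrix a (list of rows) to the vector x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def poly_apply(coeffs, a, x):
    """Compute p(A)x for p(t) = coeffs[0] + coeffs[1]*t + ... by Horner's rule."""
    result = [coeffs[-1] * xi for xi in x]
    for c in reversed(coeffs[:-1]):
        ax = matvec(a, result)
        result = [axi + c * xi for axi, xi in zip(ax, x)]
    return result

A = [[2, 1], [0, 3]]
x = [1, 1]                      # Ax = 3x, so x belongs to the proper value 3

q = [1, 0, 1]                   # q(t) = 1 + t^2, so q(3) = 10
p = [6, -5, 1]                  # p(t) = (t - 2)(t - 3), so p(3) = 0

print(matvec(A, x))             # a multiple of x
print(poly_apply(q, A, x))      # q(3) times x
print(poly_apply(p, A, x))      # the zero vector
```

In the same spirit, a projection E satisfies $E^2 = E$, so each of its
proper values must be a root of $t^2 - t$.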

§40. MULTIPLICITIES; THE TRACE OF A LINEAR TRANSFORMATION

We call attention to another possible definition of
multiplicity in order to point out and help avoid the
danger of confusing the two. Suppose that $\lambda$ is a proper
value of A; let $\mathfrak{M}$ be the collection of all vectors x
which are proper vectors of A belonging to this proper
value, i.e. for which $Ax = \lambda x$. Since by our definition
x = 0 is not a proper vector, $\mathfrak{M}$ does not contain 0;
if however we enlarge $\mathfrak{M}$ so that it contains the origin
then $\mathfrak{M}$ becomes a linear manifold. We might wish to
define the multiplicity of $\lambda$ as the dimension of this
linear manifold. This is a useful concept which we shall
call the geometric multiplicity of $\lambda$, to distinguish it
from our earlier, algebraic, notion of multiplicity. It
does not coincide with our earlier definition, as the
following example shows. If D is the differentiation
operator on the space $\mathcal{P}_n$ of all polynomials of degree
$\leq n-1$, then a vector x = x(t) is a proper vector of D
if for some number $\lambda$, $dx/dt = \lambda x$. We borrow from the
elementary theory of differential equations the fact that





every solution of this equation is a constant multiple
of $e^{\lambda t}$; since, unless $\lambda = 0$, only the zero multiple of
$e^{\lambda t}$ is a polynomial (which it must be if it is to
belong to $\mathcal{P}_n$), we must have $\lambda = 0$, and $x(t) \equiv 1$. In
other words this particular operator has only one proper
value (which must therefore appear with multiplicity n
in the sense of our earlier definition), namely $\lambda = 0$;
but, and this is more disturbing, the dimension of the
linear manifold of solutions of $Dx = \lambda x$ is exactly
one. Hence if n > 1 the two definitions of
multiplicity give different values.
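The example can be made concrete by writing D as a matrix. The sketch
below is our own illustration; it assumes the basis 1, t, ...,
t^(n-1) of $\mathcal{P}_n$, and the function names are hypothetical. The
matrix is strictly superdiagonal, so $\Delta(D - \lambda 1) = (-\lambda)^n$
and the algebraic multiplicity of 0 is n, while the space of solutions of
Dx = 0 has dimension n minus the rank of D.

```python
from fractions import Fraction

def differentiation_matrix(n):
    """Matrix of D on polynomials of degree <= n-1 in the basis 1, t, ..., t^(n-1)."""
    d = [[Fraction(0)] * n for _ in range(n)]
    for j in range(1, n):
        d[j - 1][j] = Fraction(j)   # D t^j = j t^(j-1)
    return d

def rank(m):
    """Rank of a matrix by Gaussian elimination in exact arithmetic."""
    m = [row[:] for row in m]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

n = 4
D = differentiation_matrix(n)

# Every entry on or below the main diagonal vanishes.
is_strictly_upper = all(D[i][j] == 0 for i in range(n) for j in range(i + 1))

# Dimension of the manifold of solutions of Dx = 0.
geometric_multiplicity = n - rank(D)
print(is_strictly_upper, geometric_multiplicity)
```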

It is quite easy to see that the geometric
multiplicity of $\lambda$ is always $\leq$ its algebraic multiplicity.
For if A is any linear transformation, $\lambda_0$ any of its
proper values, and $\mathfrak{M}$ the linear manifold of solutions
of $Ax = \lambda_0 x$, then it is clear that $\mathfrak{M}$ is invariant
under A. If we denote by $A_0$ the linear transformation
A considered only on $\mathfrak{M}$ then it is clear (by choosing
a basis in $\mathfrak{M}$, extending it to the whole space, and
expressing the matrix of A in the extended basis) that
$\Delta(A_0 - \lambda 1)$ is a factor of $\Delta(A - \lambda 1)$. If the
dimension of $\mathfrak{M}$ (= the geometric multiplicity of $\lambda_0$) is
m, then $\Delta(A_0 - \lambda 1) = (\lambda_0 - \lambda)^m$; recalling the
definition of algebraic multiplicity gives the desired result.
It follows, of course, that if $\lambda_1, \ldots, \lambda_p$ are the
distinct proper values of A, with respective geometric
multiplicities $m_1, \ldots, m_p$, and if $\sum_{i=1}^{p} m_i = n$, then
$m_i$ = algebraic multiplicity of $\lambda_i$, i = 1, ..., p.

Incidentally we are now able to show what we didn't
quite prove before, namely that the differentiation
operator on $\mathcal{P}_n$, for n > 1, is not completely reducible.
If it were then it could be considered as an operator on
the two linear manifolds $\mathfrak{M}$ and $\mathfrak{N}$ which completely
reduce it, and hence, since we know that every operator
has at least one proper vector, it would have a proper
vector belonging to $\mathfrak{M}$ and another one belonging to $\mathfrak{N}$.





Since this is impossible, as we showed in the preceding
paragraph, the hypothesis that it is completely
reducible is untenable.

By means of proper values and their multiplicities 
(in the sense of the algebraic definition of the preced¬ 
ing section) we can characterize two interesting func¬ 
tions of operators, one of which is the determinant and 
the other is something new. 

Let A be any linear transformation (on an
n-dimensional vector space) and let $\lambda_1, \ldots, \lambda_p$ be its
different proper values. Let us denote by $m_j$ the
multiplicity of $\lambda_j$, j = 1, ..., p, so that
$m_1 + \cdots + m_p = n$. For any polynomial equation

$\alpha_0 + \alpha_1 \lambda + \cdots + \alpha_n \lambda^n = 0$

the product of the roots is $(-1)^n \alpha_0/\alpha_n$ and the sum
of the roots is $-\alpha_{n-1}/\alpha_n$. Since the leading
coefficient (= $\alpha_n$) of the characteristic equation
$\Delta(A - \lambda 1) = 0$ is $(-1)^n$, and since the constant term
(= $\alpha_0$) is $\Delta(A - 0 \cdot 1) = \Delta(A)$, we have

(1) $\Delta(A) = \prod_{j=1}^{p} \lambda_j^{m_j}$.

This characterization of the determinant motivates the
definition

(2) $T(A) = \sum_{j=1}^{p} m_j \lambda_j$;

T(A) is called the trace of A. We shall have no
occasion to use either of these numerical functions of
operators in the sequel; we leave it to the interested
reader to verify the following simple facts about T.
If $(\alpha_{ij})$ is the matrix of A in any coordinate
system, then $T(A) = \sum_i \alpha_{ii}$; consequently T is a
linear function of A,
$T(\alpha_1 A_1 + \alpha_2 A_2) = \alpha_1 T(A_1) + \alpha_2 T(A_2)$.
Also, T(A') = T(A), T(AB) = T(BA), T(1) = n,
and the trace of a projection is the dimension of its
range.
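Two of the facts left to the reader can be spot-checked on small
matrices. The sketch below is our own illustration under the assumption
that T(A) is computed as the sum of the diagonal entries; the matrices
A, B and the projection E are hypothetical examples.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
E = [[1, 1], [0, 0]]    # E*E == E, a projection whose range is one dimensional

print(trace(matmul(A, B)), trace(matmul(B, A)))   # equal, although AB != BA
print(matmul(E, E) == E, trace(E))                # trace equals the dimension of the range
```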



§41. SUPER DIAGONAL FORM

It is now trivial to prove the easiest one of the
so called canonical form theorems.

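THEOREM 1. If A is any linear
transformation on an n-dimensional vector space
$\mathfrak{V}$ then there exist n+1 linear manifolds
$\mathfrak{M}_0, \mathfrak{M}_1, \ldots, \mathfrak{M}_{n-1}, \mathfrak{M}_n$ with the
following properties:

(i) Each $\mathfrak{M}_j$, j = 0, 1, ..., n-1, n, reduces A.

(ii) The dimension of $\mathfrak{M}_j$ is j.

(iii) $(\mathfrak{O} =)\ \mathfrak{M}_0 \subset \mathfrak{M}_1 \subset \cdots \subset \mathfrak{M}_{n-1} \subset \mathfrak{M}_n\ (= \mathfrak{V})$.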

PROOF. If n = 1 the statement is trivial; we
proceed by induction, assuming that the statement is
correct for n-1. Consider the transformation A' on $\mathfrak{V}'$;
since it has at least one proper vector, say x', it is
reduced by a one dimensional linear manifold $\mathfrak{N}$, namely
the set of all multiples of x'. Let us denote by $\mathfrak{M}_{n-1}$
the annihilator (in $\mathfrak{V}'' = \mathfrak{V}$) of $\mathfrak{N}$, $\mathfrak{M}_{n-1} = \mathfrak{N}^0$;
then $\mathfrak{M}_{n-1}$ is an (n-1)-dimensional linear manifold
in $\mathfrak{V}$, and $\mathfrak{M}_{n-1}$ reduces A. Consequently we may
consider A as a linear transformation on $\mathfrak{M}_{n-1}$ alone,
and we may find $\mathfrak{M}_0, \mathfrak{M}_1, \ldots, \mathfrak{M}_{n-2}, \mathfrak{M}_{n-1}$,
satisfying the conditions (i), (ii), (iii). We set $\mathfrak{M}_n = \mathfrak{V}$,
and we are done.

The chief interest of this theorem comes from its
matricial interpretation. Since $\mathfrak{M}_1$ is one dimensional
we may find in it a vector $x_1 \neq 0$. Since $\mathfrak{M}_1 \subset \mathfrak{M}_2$,
$x_1$ is also in $\mathfrak{M}_2$, and since $\mathfrak{M}_2$ is two dimensional
we may find in it a vector $x_2$ such that $x_1$ and $x_2$
together span $\mathfrak{M}_2$. We proceed in this way by induction,
choosing vectors $x_j$ such that $x_1, \ldots, x_j$ lie in
$\mathfrak{M}_j$ and span $\mathfrak{M}_j$ for each j = 1, ..., n. We obtain



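finally a basis $\mathcal{X} = (x_1, \ldots, x_n)$ in $\mathfrak{V}$; let us
compute the matrix of A in this coordinate system. Since
$x_j$ is in $\mathfrak{M}_j$ and since $\mathfrak{M}_j$ reduces A, $Ax_j$ must
also be in $\mathfrak{M}_j$, and is therefore a linear combination
of $x_1, \ldots, x_j$. Hence in the expression

$Ax_j = \sum_i \alpha_{ij} x_i$

the coefficient of $x_i$, for i > j, must vanish; in other
words i > j implies $\alpha_{ij} = 0$. Hence the matrix of A
has the superdiagonal form

$[A] = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ 0 & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{nn} \end{pmatrix}$.

It is clear from this representation that for i = 1, ...,
n, $\Delta(A - \alpha_{ii} 1) = 0$, so that the $\alpha_{ii}$ are the proper
values of A, appearing on the main diagonal of [A]
with the proper multiplicities. We sum up: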

THEOREM 2. Given any linear transformation
A on an n-dimensional vector space
$\mathfrak{V}$, there exists a basis $\mathcal{X}$ in $\mathfrak{V}$ such
that the matrix $[A; \mathcal{X}]$ has the superdiagonal
form; or, equivalently, given any
matrix [A], there exists a non-singular
matrix [B] such that $[B]^{-1}[A][B]$ is
superdiagonal.

The superdiagonal form is useful for proving many
results about linear transformations. It follows from
it, for example, that for any polynomial p(t) the






proper values of p(A), including multiplicities, are
precisely the numbers $p(\lambda)$, where $\lambda$ runs through the
proper values of A.


Chapter III 


ORTHOGONALITY

§42. CONCEPT OF AN INNER PRODUCT

Let us now get our feet back on the ground. We
started in Chapter I by pointing out that we wish to
generalize certain elementary properties of certain
elementary spaces such as $\mathcal{R}_2$. In our study so far we
have done this, but we have entirely omitted from
consideration one aspect of $\mathcal{R}_2$. We have studied the
qualitative aspect of linearity but we have entirely
ignored the usual quantitative concepts of angle and length.
In the present chapter we shall fill this gap: we shall
superimpose on the vector spaces we shall study certain
numerical functions corresponding to the ordinary
notions of angle and length, and we shall study the new
structure (vector space + given numerical function) so
obtained. For a clue as to how to do this we first
inspect $\mathcal{R}_2$.

If $x = \{\xi_1, \xi_2\}$ and $y = \{\eta_1, \eta_2\}$ are any two
points in $\mathcal{R}_2$, the usual formula for the distance
between x and y, or the length of the segment joining
x and y, is $\sqrt{(\xi_1 - \eta_1)^2 + (\xi_2 - \eta_2)^2}$. It is
convenient to introduce the notation

$\|x\| = \sqrt{\xi_1^2 + \xi_2^2}$

for the distance from x to the origin 0 = {0, 0}; in
this notation the distance between x and y becomes
$\|x - y\|$.

So much, for the present, for lengths and distances; 




what about angles? It turns out that it is much more
convenient to study, in the general case, not any of the
usual measures of angles but rather their cosines.
(Roughly speaking the reason for this is that the angle,
in the usual picture in the circle of radius one, is the
length of a certain circular arc, whereas the cosine of
the angle is the length of a line segment; the latter is
much easier to relate to our preceding study of linear
functions.) Suppose then that we let $\alpha$ be the angle
between the segment from 0 to x and the positive $\xi_1$
axis, and let $\beta$ be the angle between the segment from
0 to y and the same axis; the angle between the two
vectors x and y is $\alpha - \beta$ so that its cosine is

(1) $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta = \dfrac{\xi_1\eta_1 + \xi_2\eta_2}{\|x\| \cdot \|y\|}$.

Consider the expression $\xi_1\eta_1 + \xi_2\eta_2$: by means of it
we can express both angle and length by very simple
formulae. We have already seen that if we know the distance
between 0 and x for all x then we can compute the
distance between any x and y; we assert now that if
for every pair of vectors x and y we are given the
value of $\xi_1\eta_1 + \xi_2\eta_2$ then in terms of this value
we may compute all distances and all angles. For if we
take x = y then $\xi_1\eta_1 + \xi_2\eta_2$ becomes
$\xi_1^2 + \xi_2^2 = \|x\|^2$, and this takes care of lengths; the
formula (1), in turn, expresses the angle in terms of
$\xi_1\eta_1 + \xi_2\eta_2$ and the lengths $\|x\|$ and $\|y\|$.
To have a concise notation let us write for
$x = \{\xi_1, \xi_2\}$ and $y = \{\eta_1, \eta_2\}$

(2) $(x, y) = \xi_1\eta_1 + \xi_2\eta_2$;

what we said above is summarized by the relations

distance from 0 to x = $\|x\| = \sqrt{(x,x)}$,

distance from x to y = $\|x - y\|$,




cosine of angle between x and y = $\dfrac{(x,y)}{\|x\| \cdot \|y\|}$.

The important properties of (x,y), considered as a
numerical function of the pair of vectors x and y,
are the following: it is symmetric in x and y, it
depends linearly on each of its two variables, and
(unless x = 0) (x,x) is always positive.

Observe for a moment the much more trivial picture
in $\mathcal{R}_1$. For $x = \{\xi_1\}$ and $y = \{\eta_1\}$ we should
have, in this case, $(x,y) = \xi_1\eta_1$ (and it is for this
reason that (x,y) is known as the inner product or
scalar product of x and y). The angle between any
two vectors is either 0 or $\pi$, so that its cosine is
either +1 or -1. This shows up the much greater
sensitivity of the function (x,y) which takes on all
possible numerical values.

§43. GENERALIZATION TO COMPLEX SPACES 

What happens if we want to consider $\mathcal{C}_2$ instead of
$\mathcal{R}_2$? The generalization seems to lie right at hand:
for $x = \{\xi_1, \xi_2\}$ and $y = \{\eta_1, \eta_2\}$ (where now the
$\xi$'s and $\eta$'s may be complex numbers), we define
$(x,y) = \xi_1\eta_1 + \xi_2\eta_2$, and we hope that the expressions
$\|x\| = \sqrt{(x,x)}$ and $\|x - y\|$ can be used as sensible
measures of distance. Observe, however, the following
strange phenomenon (using $i = \sqrt{-1}$):

$\|ix\|^2 = (ix, ix) = i(x, ix) = i^2(x,x) = -\|x\|^2$.

This means that if $\|x\|$ is positive, i.e. if x is at
a positive distance from the origin, then ix is not;
in fact the square of the distance from 0 to ix is negative.
This is very unpleasant: surely it is reasonable to demand
that whatever it is that is going to play the role of
(x,y) in this case, it should have the property that for
x = y it doesn't ever become negative. A formal remedy
again lies close at hand: suppose that we define









$(x, y) = \xi_1\bar{\eta}_1 + \xi_2\bar{\eta}_2$

(where the bar denotes complex conjugate). In this
definition the expression (x,y) loses much of its former
beauty: it is no longer quite symmetric in x and y
and it is no longer quite linear in each of its variables.
But, and this is what prompted us to give our new
definition,

$(x, x) = \xi_1\bar{\xi}_1 + \xi_2\bar{\xi}_2 = |\xi_1|^2 + |\xi_2|^2$

is surely never negative. It is a priori dubious
whether a useful and elegant theory may be built up on
the basis of a function which fails to possess so many
of the properties that recommended it to attention in
the first place; that it is so will be shown in the
sequel. A cheerful portent is this. Consider the space
$\mathcal{C}_1$ (i.e. the set of all complex numbers). It is
impossible to draw a picture of any configuration in this
space and then to be able to tell it apart from a
configuration in $\mathcal{R}_2$, but conceptually it is clearly a
different space. The analog of (x,y) in this space,
for $x = \{\xi_1\}$ and $y = \{\eta_1\}$, is $(x,y) = \xi_1\bar{\eta}_1$, and
this expression does have a simple geometric
interpretation. If we join x and y to the origin by straight
line segments, (x,y) will not, to be sure, be the cosine
of the angle between the two segments: it turns out
that for $\|x\| = \|y\| = 1$ its real part is exactly
this cosine.
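The effect of the conjugation can be seen numerically. The sketch
below is our own illustration; the names bilinear and inner, and the
sample vectors and angles, are assumptions introduced only for this
example. It compares the form without conjugation and the form with
conjugation on the vector ix, and then checks the statement about the
real part in $\mathcal{C}_1$.

```python
import cmath
import math

def bilinear(x, y):
    """The naive form: sum of xi_k * eta_k, with no conjugation."""
    return sum(a * b for a, b in zip(x, y))

def inner(x, y):
    """The corrected form: sum of xi_k * conj(eta_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 0j, 2 + 0j]
ix = [1j * a for a in x]

naive = bilinear(ix, ix)      # equals -(x, x): a negative "squared length"
corrected = inner(ix, ix)     # equals (x, x): never negative

# In C_1: for unit vectors the real part of (u, v) is the cosine of the angle.
alpha, beta = 1.1, 0.4
u = cmath.rect(1.0, alpha)    # unit complex number at angle alpha
v = cmath.rect(1.0, beta)     # unit complex number at angle beta
real_part = (u * v.conjugate()).real

print(naive, corrected, real_part, math.cos(alpha - beta))
```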

The complex conjugates that we were forced to
introduce here will come back and plague us later: for the
present we leave this heuristic introduction and turn to
the formal work, after just one more comment on the
notation. The similarity of the symbols ( , ) and
[ , ], the one used here for inner product and the
other used earlier for linear functionals, is not
accidental. We shall show later that it is, in fact, only
the presence of the complex conjugation in ( , ) that




makes it necessary to use for it a symbol different from
[ , ]. For the present, however, we cannot afford the
luxury of confusing the two.

§44. FORMAL DEFINITION OF UNITARY SPACE 

DEFINITION. An inner product in a (real
or complex) vector space $\mathfrak{V}$ is a (respectively,
real or complex) numerically valued function
of the ordered pair of vectors x and y
such that

(1) $(x, y) = \overline{(y, x)}$,

(2) $(\alpha_1 x_1 + \alpha_2 x_2, y) = \alpha_1 (x_1, y) + \alpha_2 (x_2, y)$,

(3) $(x, x) \geq 0$; (x,x) = 0 is equivalent
to x = 0.

(In the case of a real vector space the
conjugation in (1) may, of course, be ignored.) A
unitary space is a vector space in which an
inner product is defined; in a unitary space
we shall use the notation $+\sqrt{(x,x)} = \|x\|$;
$\|x\|$ is called the norm or length of x.

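As examples of unitary spaces we may consider
$\mathcal{R}_n$, $\mathcal{C}_n$, and $\mathcal{P}$; in the first two cases we define, for
$x = \{\xi_1, \ldots, \xi_n\}$ and $y = \{\eta_1, \ldots, \eta_n\}$,
$(x,y) = \sum_{i=1}^{n} \xi_i\bar{\eta}_i$. In $\mathcal{P}$ we define, for x = x(t) and
y = y(t), $(x,y) = \int_0^1 x(t)\overline{y(t)}\,dt$.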

In a unitary space we have

(2') $(x, \beta_1 y_1 + \beta_2 y_2) = \overline{(\beta_1 y_1 + \beta_2 y_2, x)} = \bar{\beta}_1\overline{(y_1, x)} + \bar{\beta}_2\overline{(y_2, x)} = \bar{\beta}_1 (x, y_1) + \bar{\beta}_2 (x, y_2)$.

This fact, together with the definition of inner product,
explains the terminology sometimes used to describe
properties (1), (2), (3) (and their consequence (2')):
(x,y) is a Hermitian symmetric (1), conjugate bilinear





((2) and (2')), positive definite (3) form. We observe
that these properties of (x,y) imply for $\|x\|$ the
homogeneity property

(4) $\|\alpha x\| = |\alpha| \cdot \|x\|$.

(Proof: $\|\alpha x\|^2 = (\alpha x, \alpha x) = \alpha\bar{\alpha}(x,x) = |\alpha|^2 \|x\|^2$.)

THEOREM. (Schwarz's Inequality). For
any x and y, $|(x,y)| \leq \|x\| \cdot \|y\|$;
equality occurs if and only if x and y
are linearly dependent.

PROOF. We write $\alpha = \|y\|^2$, $\beta = (x,y)$. Then

$$\begin{aligned}
0 \leq \|\alpha x - \beta y\|^2 &= (\alpha x - \beta y, \alpha x - \beta y) \\
&= (\alpha x, \alpha x) - (\alpha x, \beta y) - (\beta y, \alpha x) + (\beta y, \beta y) \\
&= \|\alpha x\|^2 - [\alpha\bar{\beta}(x,y) + \bar{\alpha}\beta\overline{(x,y)}] + \|\beta y\|^2 \\
&= |\alpha|^2 \|x\|^2 - 2R[\alpha\bar{\beta}(x,y)] + |\beta|^2 \|y\|^2 \\
&= \|y\|^4 \|x\|^2 - 2\|y\|^2 |(x,y)|^2 + |(x,y)|^2 \|y\|^2 \\
&= \|y\|^2 (\|x\|^2 \|y\|^2 - |(x,y)|^2).
\end{aligned}$$

(We use $R\zeta$ to denote the real part of the complex
number $\zeta$; if $\zeta = \sigma + i\tau$ with real $\sigma$ and $\tau$ then
$R\zeta = \sigma$, and the imaginary part of $\zeta$ is $I\zeta = \tau$.)

If $y \neq 0$ the inequality follows by dividing out
$\|y\|^2$; if y = 0 we clearly have equality. More
generally since the only place an inequality came in the
above computation was in the assertion $\|\alpha x - \beta y\|^2 \geq 0$,
it is clear that the vanishing of the last term implies
the linear dependence of x and y. The reader may
easily verify the converse.
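A quick numerical check of the theorem may be helpful. The sketch below
is our own illustration in $\mathcal{C}_2$ with the inner product
$\sum_i \xi_i\bar{\eta}_i$; the name schwarz_gap and the sample vectors are
assumptions. The gap is strictly positive for an independent pair and
vanishes, up to rounding, when one vector is a scalar multiple of the
other.

```python
import math

def inner(x, y):
    """Inner product on C_n: sum of xi_k * conj(eta_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Length of x, the non-negative square root of (x, x)."""
    return math.sqrt(inner(x, x).real)

def schwarz_gap(x, y):
    """||x||*||y|| - |(x, y)|; non-negative by Schwarz's inequality."""
    return norm(x) * norm(y) - abs(inner(x, y))

x = [1 + 1j, 2 + 0j]
y = [3 + 0j, 1 - 1j]
z = [(2 - 1j) * a for a in x]   # a scalar multiple of x, so x and z are dependent

gap_independent = schwarz_gap(x, y)
gap_dependent = schwarz_gap(x, z)
print(gap_independent, gap_dependent)
```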

§45. APPLICATIONS OF SCHWARZ'S INEQUALITY

The Schwarz inequality has important arithmetic,
geometric, and analytic consequences.







(1) In any unitary space we define the distance
$\delta(x,y)$ between two vectors x and y by

$\delta(x,y) = \|x - y\| = \sqrt{(x-y, x-y)}$.

In order for $\delta$ to deserve to be called a distance it
should have the following three properties:

(i) $\delta(x,y) = \delta(y,x)$;

(ii) $\delta(x,y) \geq 0$; $\delta(x,y) = 0$ is equivalent to x = y;

(iii) $\delta(x,y) \leq \delta(x,z) + \delta(z,y)$.

(In a vector space it is also pleasant to be sure that
distance is invariant under translations:

(iv) $\delta(x,y) = \delta(x+z, y+z)$.)

Properties (i), (ii), and (iv) are obviously possessed by
the particular $\delta$ we defined; the only question is
concerning the "triangle inequality" (iii). To prove the
validity of (iii) we observe that

$$\begin{aligned}
\|x+y\|^2 = (x+y, x+y) &= \|x\|^2 + 2R(x,y) + \|y\|^2 \\
&\leq \|x\|^2 + 2|(x,y)| + \|y\|^2 \\
&\leq \|x\|^2 + 2\|x\| \cdot \|y\| + \|y\|^2 \\
&= (\|x\| + \|y\|)^2;
\end{aligned}$$

replacing x by x-z and y by z-y we obtain

$\|x - y\| \leq \|x - z\| + \|z - y\|$,

and this is equivalent to (iii).

(2) In the space $\mathcal{R}_n$, $\dfrac{(x,y)}{\|x\| \cdot \|y\|}$ is the cosine
of the angle between x and y. The Schwarz inequality
in this case merely amounts to the statement that the
cosine of a real angle is $\leq 1$.

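(3) In the space $\mathcal{C}_n$ the Schwarz inequality
becomes the so-called Cauchy inequality: for any two sets
$\{\xi_i\}$ and $\{\eta_i\}$ of complex numbers we have

$\left|\sum_{i=1}^{n} \xi_i\bar{\eta}_i\right|^2 \leq \sum_{i=1}^{n} |\xi_i|^2 \cdot \sum_{i=1}^{n} |\eta_i|^2$.

(4) In the space $\mathcal{P}$ the Schwarz inequality becomes

$\left|\int_0^1 x(t)\overline{y(t)}\,dt\right|^2 \leq \int_0^1 |x(t)|^2\,dt \cdot \int_0^1 |y(t)|^2\,dt$.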






It is useful to observe that the relations mentioned
in (1)-(4) above are not only analogous to the general
Schwarz inequality, but actually consequences or special
cases of it.

(5) We mention in passing that there is room
between the two notions (general vector spaces and unitary
spaces) for an intermediate concept that has been studied
extensively in recent years. This concept is that of a
normed vector space, i.e. a vector space in which there
is an acceptable definition of length, but nothing is
said about angles. A norm in a vector space is a
numerically valued function $\|x\|$ of the vectors x such
that $\|x\| > 0$ unless x = 0, $\|\alpha x\| = |\alpha| \cdot \|x\|$, and
$\|x+y\| \leq \|x\| + \|y\|$. Our discussion so far shows
that a unitary space is a normed vector space; the
converse is not in general true. In other words if all we
are given is $\|x\|$ satisfying the three conditions
just given, it may not be possible to find an inner
product (x,y) for which $(x,x) = \|x\|^2$. The norm in
a unitary space has an essentially quadratic character;
it can be shown for example that a necessary and
sufficient condition for the existence of an inner product
giving rise to a preassigned norm $\|x\|$ is the general
validity of the relation
$\|x+y\|^2 + \|x-y\|^2 = 2(\|x\|^2 + \|y\|^2)$.
(Compare formula (4), §59.)
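The quadratic relation just quoted separates norms that come from inner
products from those that do not. The sketch below is our own
illustration (the function names and the choice of the maximum norm as a
counterexample are assumptions, not taken from the text): the Euclidean
norm on $\mathcal{R}_2$ satisfies the relation, while the norm
$\max(|\xi_1|, |\xi_2|)$ violates it.

```python
import math

def euclidean_norm(x):
    """The norm that comes from the inner product on R_n."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def max_norm(x):
    """A norm that does not come from any inner product."""
    return max(abs(a) for a in x)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero when the relation holds."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

x, y = [1, 0], [0, 1]
print(parallelogram_defect(euclidean_norm, x, y))   # zero up to rounding
print(parallelogram_defect(max_norm, x, y))         # not zero
```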

§46. ORTHOGONALITY 

The most important relation between vectors of a
unitary space is orthogonality.

DEFINITION. x and y are orthogonal
if (x,y) = 0. (Observe that the relation is
symmetric: since $(x,y) = \overline{(y,x)}$, (x,y) and
(y,x) vanish together.)


If we recall the motivation for the introduction of
(x,y), the terminology explains itself: the two vectors
are orthogonal (or perpendicular) if the cosine of the
angle between them is 0, so that the angle between them
is 90°.

Two linear manifolds are orthogonal if every vector
in each is orthogonal to every vector in the other. A
set $\mathcal{X}$ of vectors is an orthonormal set if for every
x and y both in $\mathcal{X}$ we have (x,y) = 0 or (x,y) = 1
according as $x \neq y$ or x = y. (If $\mathcal{X}$ is finite,
$\mathcal{X} = (x_1, \ldots, x_n)$, we have $(x_i, x_j) = \delta_{ij}$.)

To make our last definition in this connection we
first observe that an orthonormal set is linearly
independent. For if $x_1, \ldots, x_k$ is any finite subset of
the orthonormal set $\mathcal{X}$, then $\sum_i \alpha_i x_i = 0$ implies

$0 = (\sum_i \alpha_i x_i, x_j) = \sum_i \alpha_i (x_i, x_j) = \sum_i \alpha_i \delta_{ij} = \alpha_j$;

in other words a linear combination of x's can vanish
only if all the coefficients vanish. Hence: in a finite
dimensional unitary space the number of vectors in an
orthonormal set is always finite and, in fact, not
greater than the linear dimension of the space. We
define, in this case, the orthogonal dimension of the space,
as the largest number of vectors an orthonormal set can
contain. We call an orthonormal set complete if it is
not contained in any larger orthonormal set.

(Warning: For all we know at this stage the
concepts of orthogonality and orthonormal sets are vacuous.
It is easy to see, however, that if the space contains
a vector $x \neq 0$, then we can always find orthogonal
vectors and orthonormal sets; for example x and 0 are
orthogonal, and the set consisting of $x/\|x\|$ alone is
an orthonormal set. We grant that the example of
orthogonal vectors we just gave is not much more inspiring
than the example x = 0, y = 0, but we shall show
presently that there are always "enough" orthogonal
vectors to operate with in comfort. Observe also that we






have no right to assume that the number of elements in
a complete orthonormal set is the orthogonal dimension.
The point is this: if we had an orthonormal set with
that many elements, it would clearly be complete; but it
is conceivable that some other set contains fewer
elements, but is still complete because its nasty structure
precludes the possibility of extending it. These
difficulties are purely verbal and will evaporate the moment
we start proving things: they occur only because from
among the several possibilities for the definition of
completeness we had to choose a definite one and we must
prove its equivalence with the others.)

We need some notation. If $\mathfrak{C}$ is any set of vectors
of a unitary space $\mathfrak{V}$, we denote by $\mathfrak{C}^\perp$ the set of all
vectors orthogonal to every vector in $\mathfrak{C}$; it is clear
that $\mathfrak{C}^\perp$ is a linear manifold (whether or not $\mathfrak{C}$ is one),
and that $\mathfrak{C}$ is contained in $\mathfrak{C}^{\perp\perp} = (\mathfrak{C}^\perp)^\perp$. It
follows that the linear manifold spanned by $\mathfrak{C}$ is contained
in $\mathfrak{C}^{\perp\perp}$. In case $\mathfrak{C}$ is a linear manifold we shall call
$\mathfrak{C}^\perp$ the orthogonal complement of $\mathfrak{C}$. We use the sign
$\perp$ in order to be reminded of orthogonality (or
perpendicularity). $\mathfrak{C}^\perp$ might be pronounced as "C perp."

§47. CHARACTERIZATIONS OF COMPLETENESS

THEOREM 1. (Bessel's Inequality). If
$\mathcal{X} = (x_1, \ldots, x_n)$ is any finite orthonormal
set in a unitary space, and x is any
vector, and if we write $\alpha_i = (x, x_i)$, then

$\sum_i |\alpha_i|^2 \leq \|x\|^2$.

Moreover $x' = x - \sum_i \alpha_i x_i$ is orthogonal to each
$x_j$ and consequently to the linear manifold
spanned by $\mathcal{X}$.


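PROOF. We have

$$\begin{aligned}
0 \leq \|x'\|^2 = (x', x') &= (x - \sum_i \alpha_i x_i,\ x - \sum_j \alpha_j x_j) \\
&= \|x\|^2 - \sum_i |\alpha_i|^2 - \sum_i |\alpha_i|^2 + \sum_i |\alpha_i|^2 \\
&= \|x\|^2 - \sum_i |\alpha_i|^2
\end{aligned}$$

and

$(x', x_j) = (x, x_j) - \sum_i \alpha_i (x_i, x_j) = \alpha_j - \alpha_j = 0$.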
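Both statements of Theorem 1 can be checked on an orthonormal set that
is not complete. The sketch below is our own illustration in
$\mathcal{C}_3$; the vectors e1 and e2 and the sample vector x are
assumptions chosen so that the arithmetic is exact.

```python
def inner(x, y):
    """Inner product on C_n: sum of xi_k * conj(eta_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def sq_abs(z):
    """|z|^2 computed as z * conj(z), which avoids a square root."""
    return (z * z.conjugate()).real

e1 = [1 + 0j, 0j, 0j]
e2 = [0j, 1 + 0j, 0j]
orthonormal = [e1, e2]            # orthonormal, but not complete in C_3

x = [1 + 1j, 2 + 0j, 3j]
alphas = [inner(x, e) for e in orthonormal]

bessel_sum = sum(sq_abs(a) for a in alphas)   # sum of |alpha_i|^2
norm_sq = inner(x, x).real                    # ||x||^2

# x' = x - sum_i alpha_i x_i, the part of x left over after the expansion
x_prime = list(x)
for a, e in zip(alphas, orthonormal):
    x_prime = [xi - a * ei for xi, ei in zip(x_prime, e)]

print(bessel_sum, norm_sq, x_prime)
```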

THEOREM 2. Let $\mathcal{X} = (x_1, \ldots, x_n)$ be
any finite orthonormal set in a unitary space
$\mathfrak{V}$; the following six conditions on $\mathcal{X}$ are
equivalent to each other.

(1) $\mathcal{X}$ is complete.

(2) $(x, x_i) = 0$ for i = 1, ..., n implies
x = 0.

(3) The linear manifold spanned by $\mathcal{X}$
is the whole space $\mathfrak{V}$.

(4) For every x in $\mathfrak{V}$, $x = \sum_i (x, x_i) x_i$.

(5) For every pair x, y in $\mathfrak{V}$,
$(x,y) = \sum_i (x, x_i)(x_i, y)$.
(Parseval's Identity).

(6) For every x in $\mathfrak{V}$, $\|x\|^2 = \sum_i |(x, x_i)|^2$.

PROOF. We shall establish the implications
(1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (4) $\Rightarrow$ (5) $\Rightarrow$ (6) $\Rightarrow$ (1). Thus we first
assume (1) and prove (2), then assume (2) to prove (3),
and so on until finally we prove (1) assuming (6).

(1) $\Rightarrow$ (2). If $(x, x_i) = 0$ for all i and $x \neq 0$
then we may adjoin $x/\|x\|$ to $\mathcal{X}$ and thus obtain an
orthonormal set larger than $\mathcal{X}$.

(2) $\Rightarrow$ (3). If there is an x which is not a linear
combination of the $x_i$ then, by the last part of Theorem
1, $x' = x - \sum_i (x, x_i) x_i$ is different from 0 and is
orthogonal to each $x_i$.

(3) $\Rightarrow$ (4). We know that every x has the form
$x = \sum_j \alpha_j x_j$; it follows that
$(x, x_i) = \sum_j \alpha_j (x_j, x_i) = \alpha_i$.

(4) $\Rightarrow$ (5). We have $x = \sum_i \alpha_i x_i$, $y = \sum_j \beta_j x_j$,
with $\alpha_i = (x, x_i)$, $\beta_j = (y, x_j)$. It follows that

$(x,y) = (\sum_i \alpha_i x_i, \sum_j \beta_j x_j) = \sum_i \sum_j \alpha_i \bar{\beta}_j (x_i, x_j) = \sum_i \alpha_i \bar{\beta}_i = \sum_i (x, x_i)(x_i, y)$.

(5) $\Rightarrow$ (6). Set x = y.

(6) $\Rightarrow$ (1). If $\mathcal{X}$ were contained in a larger
orthonormal set, say if $x_0$ is orthogonal to each $x_i$,
then $\|x_0\|^2 = \sum_i |(x_0, x_i)|^2 = 0$ so that $x_0 = 0$.

§48. EXISTENCE OF COMPLETE ORTHONORMAL SETS

THEOREM. Let $\mathfrak{V}$ be an n-dimensional
unitary space. Then complete orthonormal sets
in $\mathfrak{V}$ exist and every complete orthonormal
set contains exactly n elements, so that the
orthogonal dimension of $\mathfrak{V}$ is the same as its
linear dimension.

PROOF. To people not fussy about hunting for an
element of a possibly uncountable set, the existence is
obvious. We have already seen that orthonormal sets
exist, so we choose one; if it is not complete we may
increase it, and if the resulting orthonormal set is
still not complete we increase it again, and we proceed
in this way by induction. Since an orthonormal set may
contain at most n elements, in at most n steps we
will have reached a complete orthonormal set. This set
spans the whole space (§47, Theorem 2, (1) $\Rightarrow$ (3)), and
since it is also linearly independent, it is a basis and
hence contains precisely n elements.

There is a constructive method of avoiding this
crude induction, and since it sheds further light on the
notions involved we reproduce it here as an alternative




proof of the theorem. 

The Gram-Schmidt orthogonalization process. Let
$\mathcal{X} = (x_1, \ldots, x_n)$ be any basis in $\mathfrak{V}$. We shall
construct a complete orthonormal set $\mathcal{Y} = (y_1, \ldots, y_n)$
with the property that each $y_j$ is a linear combination
of $x_1, \ldots, x_j$. To do this for j = 1, we need only to
observe that $x_1 \neq 0$ (since $\mathcal{X}$ is linearly independent);
we write $y_1 = x_1/\|x_1\|$. Suppose then that $y_1, \ldots, y_r$
have been found so that they form an orthonormal set and
so that each $y_j$ (j = 1, ..., r) is a linear combination
of $x_1, \ldots, x_j$. We write
$z = x_{r+1} - (\alpha_1 y_1 + \cdots + \alpha_r y_r)$,
where $\alpha_1, \ldots, \alpha_r$ are any scalars, and we
observe that for j = 1, ..., r

$(z, y_j) = (x_{r+1} - \sum_i \alpha_i y_i,\ y_j) = (x_{r+1}, y_j) - \alpha_j$,

so that if we choose $\alpha_j = (x_{r+1}, y_j)$ then $(z, y_j) = 0$
for j = 1, ..., r. Since, moreover, z is a linear
combination of $x_{r+1}$ and $y_1, \ldots, y_r$, it is also a
linear combination of $x_{r+1}$ and $x_1, \ldots, x_r$. Finally
z is different from zero, since $x_1, \ldots, x_r, x_{r+1}$ are
linearly independent and the coefficient of $x_{r+1}$ in
the expression for z is not zero. We write
$y_{r+1} = z/\|z\|$; clearly $y_1, \ldots, y_r, y_{r+1}$ is again an
orthonormal set with all the desired properties, and the
induction step is established. We shall make use of the
fact that not only is $y_j$ a linear combination of the
first j x's but conversely each $x_j$ is a linear
combination of the first j y's.
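The inductive step above translates directly into a short program. The
sketch below is our own illustration for $\mathcal{C}_n$ with the inner
product $\sum_i \xi_i\bar{\eta}_i$; the function name gram_schmidt, the
tolerance 1e-12, and the sample basis are assumptions. At each step it
forms $z = x_{r+1} - \sum_j (x_{r+1}, y_j) y_j$ and normalizes.

```python
import math

def inner(x, y):
    """Inner product on C_n: sum of xi_k * conj(eta_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Length of x, the non-negative square root of (x, x)."""
    return math.sqrt(inner(x, x).real)

def gram_schmidt(basis):
    """Orthonormalize a linearly independent list of vectors, step by step."""
    ys = []
    for x in basis:
        z = list(x)
        for y in ys:
            alpha = inner(x, y)                 # alpha_j = (x_{r+1}, y_j)
            z = [zi - alpha * yi for zi, yi in zip(z, y)]
        length = norm(z)
        if length < 1e-12:                      # z = 0 means the input was dependent
            raise ValueError("vectors are linearly dependent")
        ys.append([zi / length for zi in z])    # y_{r+1} = z / ||z||
    return ys

X = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
Y = gram_schmidt(X)

# Parseval's identity for the resulting complete orthonormal set.
v = [2 - 1j, 0.5 + 0j, 3j]
parseval_sum = sum(abs(inner(v, y)) ** 2 for y in Y)
print(parseval_sum, norm(v) ** 2)
```

Because each coefficient is computed from the original $x_{r+1}$, this is
the classical form of the process, exactly as in the text; in floating
point arithmetic other orderings of the same subtractions are sometimes
preferred for accuracy.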

We shall find it convenient and natural, in unitary
spaces, to work exclusively with such bases as are also
complete orthonormal sets. We shall call such a basis
an orthogonal basis or an orthogonal coordinate system;
whenever we shall discuss not necessarily orthogonal
bases we shall emphasize this fact by calling them
linear bases.





§49. PROJECTION THEOREM

Since a linear manifold in a unitary space may
itself be considered as a unitary space, the theorem of
the preceding section may be applied. As the most
important application of this fact we have the following
projection theorem.

THEOREM. If $\mathfrak{M}$ is any linear manifold
in a finite dimensional unitary space $\mathfrak{V}$,
then $\mathfrak{V}$ is the direct sum of $\mathfrak{M}$ and $\mathfrak{M}^\perp$,
and $\mathfrak{M}^{\perp\perp} = \mathfrak{M}$.

PROOF. Let $\mathcal{X} = (x_1, \ldots, x_m)$ be an orthonormal
set which is complete in $\mathfrak{M}$, and let z be any vector
in $\mathfrak{V}$. We write $x = \sum_i \alpha_i x_i$, where $\alpha_i = (z, x_i)$; it
follows from §47, Theorem 1, that y = z - x is in $\mathfrak{M}^\perp$,
so that z may be written as a sum of two vectors,
z = x + y, with x in $\mathfrak{M}$ and y in $\mathfrak{M}^\perp$. That $\mathfrak{M}$ and
$\mathfrak{M}^\perp$ are disjoint is clear: if x belonged to both
then we should have $\|x\|^2 = (x,x) = 0$. It follows from
the theorem of §17 that $\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{M}^\perp$.

We observe that in the decomposition z = x + y, we
have $(z,x) = (x+y, x) = \|x\|^2 + (y,x) = \|x\|^2$ and
similarly $(z,y) = \|y\|^2$. Hence, if z is in $\mathfrak{M}^{\perp\perp}$,
so that (z,y) = 0, then $\|y\|^2 = 0$, so that z = x is
in $\mathfrak{M}$: in other words $\mathfrak{M}^{\perp\perp}$ is contained in $\mathfrak{M}$.
Since we already know that $\mathfrak{M}$ is contained in $\mathfrak{M}^{\perp\perp}$,
the proof of the theorem is complete.

This kind of direct sum decomposition of a unitary
space (i.e. by a manifold and its orthogonal complement)
is of considerable geometric interest. We shall study
a little later the associated projections: they turn out
to be an interesting and important subclass of the class
of all projections. At present we remark only on the
connection with the Pythagorean theorem: since
$(z,x) = \|x\|^2$ and $(z,y) = \|y\|^2$, we have





$\|z\|^2 = (z,z) = (z,x) + (z,y) = \|x\|^2 + \|y\|^2$.

In other words the square of the hypotenuse is the sum
of the squares of the sides. More generally if
$\mathfrak{M}_1, \ldots, \mathfrak{M}_k$ is any collection of pairwise orthogonal
linear manifolds in a unitary space $\mathfrak{V}$, and if
$x = x_1 + \cdots + x_k$, with $x_j$ in $\mathfrak{M}_j$ for j = 1, ..., k,
then

$\|x\|^2 = \|x_1\|^2 + \cdots + \|x_k\|^2$.
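The decomposition z = x + y used in the proof can be computed exactly
in a small case. The sketch below is our own illustration in the real
space $\mathcal{R}_3$ (so that no conjugation is needed); the function
name project, the manifold, and the vector z are assumptions. The
manifold $\mathfrak{M}$ is spanned by the orthonormal vectors x1 and x2.

```python
from fractions import Fraction

def inner(x, y):
    """Inner product on the real space R_n."""
    return sum(a * b for a, b in zip(x, y))

def project(z, orthonormal):
    """Split z as x + y with x in the span of the orthonormal set and y orthogonal to it."""
    x = [0] * len(z)
    for e in orthonormal:
        alpha = inner(z, e)                     # alpha_i = (z, x_i)
        x = [xi + alpha * ei for xi, ei in zip(x, e)]
    y = [zi - xi for zi, xi in zip(z, x)]
    return x, y

x1 = [1, 0, 0]
x2 = [0, Fraction(3, 5), Fraction(4, 5)]        # a unit vector orthogonal to x1
z = [1, 5, 10]

x, y = project(z, [x1, x2])
print(x, y)
print(inner(z, z), inner(x, x) + inner(y, y))   # the Pythagorean relation
```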

§50. REPRESENTATION OF LINEAR FUNCTIONALS

We are now in a position to study linear
functionals on unitary spaces. For a general n-dimensional
vector space, the conjugate space is also n-dimensional
and is therefore isomorphic to the original space. There
is, however, no obvious natural isomorphism that we can
set up; we have to wait for the second conjugate space
to get back where we came from. The main point of the
theorem we shall prove is that in unitary spaces there
is a "natural" correspondence between $\mathfrak{V}$ and $\mathfrak{V}'$: the
only cloud on the horizon is that it is not quite an
isomorphism.


THEOREM. To any linear functional
$y' = y'(x)$ on a finite dimensional unitary
space $\mathfrak{V}$ there corresponds a unique vector
$y$ in $\mathfrak{V}$ such that, for all $x$, $y'(x) = (x,y)$.

PROOF. If $y' = 0$ we may choose $y = 0$; let us
from now on assume that $y'$ is not identically zero.
Let $\mathfrak{M}$ be the linear manifold of all vectors $x$ for
which $y'(x) = 0$, and let $\mathfrak{N} = \mathfrak{M}^{\perp}$ be the
orthogonal complement of $\mathfrak{M}$. Then $\mathfrak{N}$ contains a
vector $y_0 \neq 0$; by multiplication with a suitable constant we may
assume $\|y_0\| = 1$. We write $y = \overline{y'(y_0)} \cdot y_0$
(where the bar denotes, as usual, complex conjugation); we do then have
the desired relation

$$y'(x) = (x,y) \tag{1}$$

at least for $x = y_0$ and for all $x$ in $\mathfrak{M}$. For an
arbitrary $x$ in $\mathfrak{V}$ we write $x_0 = x - \lambda y_0$ where

$$\lambda = \frac{y'(x)}{y'(y_0)};$$

then $y'(x_0) = 0$ and $x = x_0 + \lambda y_0$ is a linear
combination of two vectors for each of which (1) is valid.
From the linearity of both sides of (1) it follows that
(1) holds for $x$, as was to be proved.

To prove uniqueness suppose that $(x,y_1) = (x,y_2)$
for all $x$; then $(x, y_1 - y_2) = 0$ for all $x$ and
therefore in particular for $x = y_1 - y_2$, so that
$\|y_1 - y_2\|^2 = 0$, and $y_1 = y_2$.
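
In coordinates the representing vector is easy to exhibit. The sketch below is ours and rests on assumptions not made in the text: the space is taken to be the space of n-tuples of complex numbers with the usual inner product, and the vector $y$ is computed from the values of the functional on the standard orthonormal basis instead of by the coordinate-free argument of the proof. The names `representing_vector`, `functional`, and `coeffs` are hypothetical.

```python
# Illustrative sketch: find the vector y that represents a linear functional.
# The functional, its coefficients, and all names are assumptions.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def representing_vector(functional, n):
    """Return y such that functional(x) == (x, y) on n-tuples."""
    y = []
    for i in range(n):
        e_i = [1 + 0j if k == i else 0j for k in range(n)]
        # The i-th coordinate of y is the conjugate of the value on e_i.
        y.append(complex(functional(e_i)).conjugate())
    return y

coeffs = [2 + 1j, -1j, 3 + 0j]  # hypothetical functional: sum_i c_i * x_i

def functional(x):
    return sum(c * xk for c, xk in zip(coeffs, x))

y = representing_vector(functional, 3)
x = [1 - 2j, 4 + 0j, 0.5j]

# The two sides of y'(x) = (x, y) should agree up to rounding error.
print(abs(functional(x) - inner(x, y)) < 1e-12)
```

The conjugation in `representing_vector` is the source of the conjugate (and not ordinary) isomorphism between functionals and vectors.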

The correspondence $y' \rightleftarrows y$ is a one to one
correspondence between $\mathfrak{V}$ and $\mathfrak{V}'$, with the
property that to $y_1' + y_2'$ there corresponds $y_1 + y_2$ and to
$\alpha y'$ there corresponds $\bar{\alpha} y$: for this reason we
refer to it as a conjugate isomorphism. In spite of the fact that this
conjugate isomorphism makes $\mathfrak{V}'$ practically
indistinguishable from $\mathfrak{V}$, it is wise to keep the two
conceptually separate. One reason for this is that we should like
$\mathfrak{V}'$ to be a unitary space along with $\mathfrak{V}$; if,
however, we follow the clue given by the conjugate isomorphism
between $\mathfrak{V}$ and $\mathfrak{V}'$ the conjugation again causes
trouble. Let $y_1'$ and $y_2'$ be any two elements of $\mathfrak{V}'$;
if $y_1'(x) = (x,y_1)$ and $y_2'(x) = (x,y_2)$ the temptation is great
to define

$$(y_1', y_2') = (y_1, y_2).$$

A moment's consideration will show that this expression
does not satisfy (a) (§44) and is not therefore a suitable
inner product: thus we have

$$(\alpha y_1', y_2') = (\bar{\alpha} y_1, y_2) = \bar{\alpha}(y_1, y_2) = \bar{\alpha}(y_1', y_2').$$

The remedy is clear; we define

$$(y_1', y_2') = \overline{(y_1, y_2)} = (y_2, y_1);$$

we leave it to the reader to verify that with this definition
$\mathfrak{V}'$ becomes a unitary space. We shall denote
this unitary space by $\mathfrak{V}^*$.

§ 51. RELATION BETWEEN PARENTHESES AND BRACKETS

It becomes necessary now to straighten out the relation
between general vector spaces and unitary spaces.
The theorem of the preceding section shows that, as
long as we are careful about complex conjugation, $(x,y)$
can completely take the place of $[x,y]$. It might seem
that it would have been desirable to develop the entire
subject of general vector spaces in such a way that the
concept of orthogonality in a complex unitary space becomes
not merely an analog but a special case of some previously
studied general relation between vectors and
functionals. One way, for example, of avoiding the
unpleasantness of conjugation (or, rather, of shifting it
to a less conspicuous position) would have been to define
the conjugate space of a complex vector space as the set
of conjugate linear functionals, i.e. the set of numerical
valued functions $y(x)$ for which

$$y(\alpha_1 x_1 + \alpha_2 x_2) = \bar{\alpha}_1 y(x_1) + \bar{\alpha}_2 y(x_2).$$

Because it seemed pointless (and contrary to common
usage) to introduce this complication into the general
theory we chose instead the roundabout way that we just
travelled. Since from now on we shall deal with unitary
spaces only, we ask the reader mentally to revise all
the preceding work by replacing, throughout, the bracket
$[x,y]$ by the parenthesis $(x,y)$. Let us examine the
effect of this change on the theorems and definitions of
the first two chapters.

Replacing $\mathfrak{V}'$ by $\mathfrak{V}^*$ is merely a change of
notation: the new symbol is supposed to remind us that something
new (namely an inner product) has been added to $\mathfrak{V}'$.
Of a little more interest is the (conjugate) isomorphism
between $\mathfrak{V}$ and $\mathfrak{V}^*$: by means of it the
theorems of §14, asserting the existence of linear functionals with
various properties, may now be interpreted as asserting
the existence of certain vectors in $\mathfrak{V}$ itself. Thus,
for example, the existence of a dual basis to any given
basis $\mathfrak{X} = (x_1, \dots, x_n)$ implies now the existence of
a basis $\mathfrak{Y} = (y_1, \dots, y_n)$ (of $\mathfrak{V}$) with the
property that $(x_i, y_j) = \delta_{ij}$. Query: what does it mean for
a basis to be self-dual, i.e. $x_i = y_i$, $i = 1, \dots, n$?

More exciting still is the implied replacement of
the annihilator $\mathfrak{M}^0$ of a linear manifold $\mathfrak{M}$
($\mathfrak{M}^0$ lying in $\mathfrak{V}'$ or $\mathfrak{V}^*$) by the
orthogonal complement $\mathfrak{M}^{\perp}$ (lying, along with
$\mathfrak{M}$, in $\mathfrak{V}$). The most radical new
development, however, concerns the adjoint of a linear
transformation. Thus we may write the analog of (1)
(§32) and corresponding to every linear transformation
$A$ on $\mathfrak{V}$ we may define a linear transformation $A^*$ by
the relation

$$(Ax,y) = (x,A^*y). \tag{1}$$

$A^*$ is again a linear transformation defined on the same
vector space $\mathfrak{V}$, but because of the Hermitian symmetry
of $(x,y)$ the relation between $A$ and $A^*$ is not
quite the same as the relation between $A$ and $A'$: the
most notable difference is that $(\alpha A)^* = \bar{\alpha} A^*$
(and not $= \alpha A^*$). Associated with this phenomenon is
the fact that if the matrix of $A$, with respect to some
fixed basis, is $(\alpha_{ij})$ then the matrix of $A^*$, with
respect to the dual basis, is not $(\alpha_{ji})$ but
$(\bar{\alpha}_{ji})$; also for determinants we do not have
$\Delta(A^*) = \Delta(A)$ but $\Delta(A^*) = \overline{\Delta(A)}$,
and, consequently, the proper values of $A^*$ are not the same as
those of $A$, but rather their conjugates. Here, however, the
differences stop. All the other results of §32 on the
anti-isomorphic nature of the correspondence $A \rightleftarrows A^*$
are valid; the identity $A = A^{**}$ is strictly true and does not need
the help of an isomorphism to interpret it.
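
These rules for the adjoint can be checked on a small example. The sketch below is ours; it assumes that a transformation is given by its matrix (a list of rows) with respect to an orthonormal basis, so that the adjoint is the conjugate transposed matrix. The names `apply`, `adjoint`, and `scale`, as well as the sample matrix and vectors, are illustrative assumptions.

```python
# Illustrative sketch: the adjoint as the conjugate transposed matrix.
# The matrix A, the vectors x, y and the scalar alpha are arbitrary samples.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def scale(alpha, A):
    """Scalar multiple alpha * A."""
    return [[alpha * a for a in row] for row in A]

A = [[1 + 2j, 3j], [4 + 0j, 2 - 1j]]
x = [1 + 1j, 2 - 1j]
y = [-1j, 3 + 0j]
alpha = 2 - 3j

# (Ax, y) = (x, A*y)
print(abs(inner(apply(A, x), y) - inner(x, apply(adjoint(A), y))) < 1e-12)
# (alpha A)* = conj(alpha) A*
print(adjoint(scale(alpha, A)) == scale(alpha.conjugate(), adjoint(A)))
# A** = A
print(adjoint(adjoint(A)) == A)
```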

Presently we shall discuss linear transformations
on unitary spaces and we shall see that the principal new
feature differentiating their study from the discussion
of Chapter II is the possibility of comparing $A$ and $A^*$
as linear transformations on the same space, and of
investigating those classes of linear transformations which
bear a particularly simple relation to their adjoints.

§ 52. COMPARISON OF THE "NATURAL" ISOMORPHISMS
FROM $\mathfrak{V}$ TO $\mathfrak{V}^{**}$

There is now only one more possible doubt that the
reader might (or, at any rate, should) have. Many of
our preceding results were consequences of such
reflexivity relations as $A^{**} = A$; do these remain valid after
the brackets to parentheses revolution? More to the
point is the following way of asking the question:
everything we say about the unitary space $\mathfrak{V}$ must also be
true about the unitary space $\mathfrak{V}^*$; in particular it is
also in a natural conjugate isomorphic relation with its
conjugate space $\mathfrak{V}^{**}$. If now to every vector in
$\mathfrak{V}$ we make correspond a vector in $\mathfrak{V}^{**}$, by
first applying the natural conjugate isomorphism from $\mathfrak{V}$
to $\mathfrak{V}^*$ and then going the same way from $\mathfrak{V}^*$
to $\mathfrak{V}^{**}$, then this mapping is a rival for the title of
natural mapping from $\mathfrak{V}$ to $\mathfrak{V}^{**}$, a title
already awarded in Chapter I to a seemingly different correspondence.
What is the relation between the two natural correspondences? Our
statements about the coincidence, except for trivial
modifications, of the parenthesis and bracket theories,
are really justified by the fact, which we shall now
show, that the two mappings are the same. (It should
not be surprising, since $\bar{\bar{\alpha}} = \alpha$, that after two
applications the bothersome conjugation disappears). The proof
is shorter than the introduction to it.






Let $y_0$ be any element of $\mathfrak{V}$; to it there corresponds
the linear functional $y_0^* = y_0^*(x) = (x, y_0)$ in
$\mathfrak{V}^*$, and to $y_0^*$ in turn there corresponds the linear
functional $y_0^{**} = y_0^{**}(y^*) = (y^*, y_0^*)$ in
$\mathfrak{V}^{**}$. Both these correspondences are given by the
mapping introduced in this chapter. Previously (see § 15) the
correspondent $y_0^{**}$ in $\mathfrak{V}^{**}$ of $y_0$ in
$\mathfrak{V}$ was defined by $y_0^{**}(y^*) = y^*(y_0)$ for all $y^*$
in $\mathfrak{V}^*$: we must show that $y_0^{**}$, as we here defined
it, satisfies this relation. Let $y^* = y^*(x) = (x,y)$ be any linear
functional on $\mathfrak{V}$ (i.e. any element of $\mathfrak{V}^*$); we
have

$$y_0^{**}(y^*) = (y^*, y_0^*) = (y_0, y) = y^*(y_0).$$

(The middle equality comes from the definition of inner
product in $\mathfrak{V}^*$). This settles all our problems.

A word about direct sums. We may define the direct
sum of two unitary spaces $\mathfrak{U}$ and $\mathfrak{V}$ just as we
defined (in § 17) the direct sum of any two vector spaces: it is
only necessary to say something about the inner product.
The obvious solution works in this case: we define
(using the notation of § 17)

$$(\{x_1, y_1\}, \{x_2, y_2\}) = (x_1, x_2) + (y_1, y_2).$$

There is not much more to be said. We should prove the
analog of the theorem of § 17, i.e. we should be able to
decide when a unitary space may be considered as the
direct sum of two of its subspaces. We leave it to the
reader to verify that a necessary and sufficient condition
is that the two linear manifolds be orthogonal complements
of each other.

§ 53. LINEAR TRANSFORMATIONS ON A UNITARY SPACE

Let us now study the algebraic structure of the
class of all linear transformations on a unitary space
$\mathfrak{V}$. In many fundamental respects this class resembles
the class of all complex numbers. In both systems the
notions of addition, multiplication, 0, and 1 are defined
and have similar properties, and in both systems
there is an involutory (anti-) automorphism of the system
on itself, namely $A \to A^*$ and $\zeta \to \bar{\zeta}$. We shall
use this analogy as a heuristic principle and we shall
attempt to carry over to linear transformations some well
known concepts of the complex domain. We will be
hindered in this work by two properties of linear
transformations, of which, possibly surprisingly, the second
is much more serious: the impossibility of unrestricted
division and the non commutativity of general linear
transformations.

First we need an auxiliary result.

THEOREM. If $A$ is a linear transformation
on a complex unitary space $\mathfrak{V}$ then the
vanishing, identically, of either of the two
expressions $(Ax,x)$ and $(Ax,y)$ is necessary
and sufficient for $A$ to be zero.

PROOF. That either condition is necessary is clear,
as also is the fact that the second condition is
sufficient. (If $(Ax,y) = 0$ for all $x$ and $y$ then
choose $y = Ax$; it follows that $Ax = 0$ for all $x$).
In fact this sufficiency has its analog in pure vector
space theory: $[Ax,y]$ vanishes identically if and only if
$A = 0$. (If $[Ax,y] = 0$ for all $y$, then $Ax = 0$ for
each $x$, and hence $A = 0$). It is the sufficiency of the
condition $(Ax,x) = 0$ that is really peculiar to complex
unitary spaces.

For the proof of this sufficiency we use the
so-called polarization identity:

$$\alpha\bar{\beta}(Ax,y) + \bar{\alpha}\beta(Ay,x) = (A(\alpha x + \beta y), \alpha x + \beta y) - |\alpha|^2 (Ax,x) - |\beta|^2 (Ay,y). \tag{1}$$

(We leave to the reader the simple verification carried
out by expanding the first term on the right). If $(Ax,x)$
is identically zero then we obtain, first choosing
$\alpha = \beta = 1$, and then $\alpha = i$ $(= \sqrt{-1})$, $\beta = 1$,

$$(Ax,y) + (Ay,x) = 0,$$
$$i(Ax,y) - i(Ay,x) = 0.$$

Dividing the second of these two equations by $i$ and
then forming their arithmetic mean we see that $(Ax,y) = 0$
for all $x$ and $y$, so that, by the easier half of
our theorem, $A = 0$.

This process of polarization is often used to get
information about the "bilinear form" $(Ax,y)$ when only
knowledge of the "quadratic form" $(Ax,x)$ is assumed.

It is important to observe that this seemingly
innocuous auxiliary theorem uses very essentially the
complex number system; it and many of its consequences
fail to be true in a real vector space. The proof of
course breaks down at our choice of $\alpha = i$. For an
example: a 90° rotation of the plane clearly has the
property that it sends every vector $x$ into a vector
$Ax$ orthogonal to it.
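
The rotation example can be made concrete. The sketch below is ours; it assumes the usual matrix of a 90° rotation with respect to an orthonormal basis of the plane, and the names `apply` and `quadratic_form` are illustrative. It shows that $(Ax,x)$ vanishes for the real vectors tried, while the same matrix, regarded as acting on a complex space, gives a non zero value for a suitable complex vector.

```python
# Illustrative sketch: (Ax, x) = 0 for real x does not force A = 0.
# The matrix R and the sample vectors are assumptions chosen for the example.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def quadratic_form(A, x):
    """The value (Ax, x)."""
    return inner(apply(A, x), x)

R = [[0, -1], [1, 0]]  # rotation of the plane through 90 degrees

# A grid of real vectors: every value of (Rx, x) is zero.
real_values = [quadratic_form(R, [a, b])
               for a in range(-2, 3) for b in range(-2, 3)]
print(all(v == 0 for v in real_values))

# A complex vector for which (Rx, x) does not vanish.
complex_value = quadratic_form(R, [1, 1j])
print(complex_value != 0)
```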

As a curiosity concerning the form $(Ax,x)$ we
mention the following fact. The set of all possible values
of $(Ax,x)$, as $x$ ranges over the unit sphere (i.e. the
set of all vectors $x$ for which $\|x\| = 1$) is a
convex set (in the complex plane) containing all proper
values of $A$. For a special kind of transformations
(namely normal transformations; see §64) this convex set
is the smallest convex polygon determined by the proper
values of $A$.

§ 54. HERMITIAN TRANSFORMATIONS

The three most important subsets of the complex
number plane are the real numbers, the positive real
numbers, and the numbers of absolute value one. We
shall now proceed systematically to use our heuristic
analogy of transformations with complex numbers and try
to discover the analogs among transformations of these
well known numerical concepts.

When is a complex number real? Clearly a necessary
and sufficient condition for the reality of $\zeta$ is the
equation $\zeta = \bar{\zeta}$. We might accordingly (remembering
that the analog of the complex conjugate for linear
transformations is the adjoint) define a linear
transformation $A$ to be real if $A = A^*$. More commonly
linear transformations $A$ for which $A = A^*$ are called
Hermitian (or symmetric, Hermitian symmetric,
self-adjoint). We shall see that Hermitian transformations do
indeed play the same role as real numbers: the first
indication that they are tied up with the concept of
reality in more ways than through the formal analogy
that suggested their definition is the following theorem.

THEOREM. A necessary and sufficient
condition that a linear transformation $A$
defined on a complex unitary space be Hermitian
is that $(Ax,x)$ be real for all $x$.

PROOF. If $A = A^*$ then

$$(Ax,x) = (x,A^*x) = (x,Ax) = \overline{(Ax,x)},$$

so that $(Ax,x)$ is equal to its own conjugate and is
therefore real. If conversely $(Ax,x)$ is always real
then

$$(Ax,x) = \overline{(Ax,x)} = \overline{(x,A^*x)} = (A^*x,x),$$

so that $([A - A^*]x,x) = 0$ for all $x$, and, by the theorem
of the preceding section, $A = A^*$.

This theorem, as well as the theorem used in proving
it, is false in real spaces. (Example?) For, in the
first place, its proof depends on a theorem that is valid
only in complex unitary spaces, and, in the second place,
in a real space the reality of $(Ax,x)$ (in fact of
$(Ax,y)$) is a condition automatically satisfied by all $A$,
whereas the condition $A = A^*$, or, equivalently,
$(Ax,y) = (x,Ay)$, need not be satisfied.







Another proof of the thoroughgoing nature of our
analogy is this fact: an arbitrary linear transformation
$A$ may be expressed, in one and only one way, in
the form $A = B + iC$, where $B$ and $C$ are Hermitian.
(We shall refer to $B$ and $C$ as the real and imaginary
parts of $A$; the representation $A = B + iC$ is called
the Cartesian decomposition of $A$). For if we write

$$B = \tfrac{1}{2}(A + A^*), \tag{1}$$
$$C = \tfrac{1}{2i}(A - A^*), \tag{2}$$

then we have $B^* = \tfrac{1}{2}(A^* + A) = B$ and
$C^* = -\tfrac{1}{2i}(A^* - A) = C$, and, of course, $A = B + iC$.
From this proof of the existence of a Cartesian decomposition its
uniqueness is also clear: if we do have $A = B + iC$,
then $A^* = B - iC$ and consequently $A$, $B$, and $C$ are
again connected by (1) and (2).
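
The formulas for the real and imaginary parts translate directly into a computation. The sketch below is ours; it assumes that transformations are given by square matrices with respect to an orthonormal basis, so that the adjoint is the conjugate transpose. The names `cartesian` and `is_hermitian` and the sample matrix are illustrative assumptions.

```python
# Illustrative sketch: Cartesian decomposition A = B + iC of a square matrix.
# The sample matrix A is arbitrary; tolerances allow for rounding error.

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def cartesian(A):
    """Return (B, C) with B = (A + A*)/2 and C = (A - A*)/(2i)."""
    A_star = adjoint(A)
    n = len(A)
    B = [[(A[i][j] + A_star[i][j]) / 2 for j in range(n)] for i in range(n)]
    C = [[(A[i][j] - A_star[i][j]) / 2j for j in range(n)] for i in range(n)]
    return B, C

def is_hermitian(A, tol=1e-12):
    """True if A agrees with its conjugate transpose up to tol."""
    A_star = adjoint(A)
    n = len(A)
    return all(abs(A[i][j] - A_star[i][j]) < tol
               for i in range(n) for j in range(n))

A = [[1 + 2j, 3 + 0j], [4j, 5 - 6j]]
B, C = cartesian(A)
recombined = [[B[i][j] + 1j * C[i][j] for j in range(2)] for i in range(2)]

print(is_hermitian(B), is_hermitian(C))
print(all(abs(recombined[i][j] - A[i][j]) < 1e-12
          for i in range(2) for j in range(2)))
```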

§ 55. ALGEBRAIC COMBINATIONS

It is quite easy to characterize the matrix of a
Hermitian transformation $A$ with respect to an
orthogonal basis $\mathfrak{X} = (x_1, \dots, x_n)$. If the matrix of
$A$ is $(\alpha_{ij})$ then we know that the matrix of $A^*$, with
respect to the dual basis of $\mathfrak{X}$, is $(\alpha^*_{ij})$ where
$\alpha^*_{ij} = \bar{\alpha}_{ji}$; an orthogonal basis is self dual so
that we have, (since $A = A^*$),

$$\alpha_{ij} = \bar{\alpha}_{ji}. \tag{1}$$

We leave it to the reader to verify the converse: if
$(\alpha_{ij})$ is a matrix satisfying (1) then we may define
the linear transformation $A$, by means of this matrix
and an arbitrary orthogonal coordinate system
$\mathfrak{X} = (x_1, \dots, x_n)$, by the usual equations

$$A\Bigl(\sum_i \xi_i x_i\Bigr) = \sum_i \eta_i x_i,$$
$$\eta_i = \sum_j \alpha_{ij} \xi_j;$$

the condition (1) implies that the $A$ so obtained is
Hermitian.

As a valuable exercise in the use of the inner
product in $\mathfrak{P}$ that we defined in §44 the reader may
wish to verify that the multiplication operator $T$,
defined in (6) § 20, is Hermitian whereas the
differentiation operator $D$, defined in (4) § 20, is not.

The algebraic rules for the manipulation of
Hermitian transformations are easy to remember if we think
of such transformations as the analogs of real numbers.
Thus: if $A$ and $B$ are Hermitian, so is $A + B$; if
$A \neq 0$ is Hermitian and $\alpha \neq 0$ then $\alpha A$ is
Hermitian if and only if $\alpha$ is real; and if $A$ has an inverse
then $A$ and $A^{-1}$ are both or neither Hermitian. The place
where something always goes wrong is multiplication:
the product of two Hermitian transformations need not
be Hermitian. However:

THEOREM 1. If $A$ and $B$ are Hermitian
then $AB$ and $BA$ are Hermitian if and only
if they are equal, (i.e. if and only if $A$
and $B$ commute).

PROOF. If $AB = BA$ then $(AB)^* = B^*A^* = BA = AB$.
If $(AB)^* = AB$ then $AB = (AB)^* = B^*A^* = BA$.
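
A small numerical illustration of Theorem 1 may be helpful. The sketch below is ours; the matrices are arbitrary Hermitian samples, taken with respect to an orthonormal basis, and the helper names `matmul` and `is_hermitian` are assumptions.

```python
# Illustrative sketch: a product of Hermitian matrices is Hermitian
# exactly when the factors commute. The sample matrices are assumptions.

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_hermitian(A, tol=1e-12):
    """True if A agrees with its conjugate transpose up to tol."""
    A_star = adjoint(A)
    n = len(A)
    return all(abs(A[i][j] - A_star[i][j]) < tol
               for i in range(n) for j in range(n))

A = [[2, 1j], [-1j, 3]]   # Hermitian
B = [[0, 1], [1, 0]]      # Hermitian, does not commute with A
A2 = matmul(A, A)         # Hermitian, commutes with A

print(matmul(A, B) == matmul(B, A), is_hermitian(matmul(A, B)))
print(matmul(A, A2) == matmul(A2, A), is_hermitian(matmul(A, A2)))
```

The first line reports a non commuting pair whose product fails to be Hermitian; the second reports a commuting pair whose product is Hermitian.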

THEOREM 2. If $A$ is Hermitian then for
an arbitrary $B$, $B^*AB$ is Hermitian; if $B$
has an inverse and $B^*AB$ is Hermitian then
so is $A$.

PROOF. If $A = A^*$, then $(B^*AB)^* = B^*A^*B^{**} = B^*AB$.
If $B$ has an inverse and $B^*AB$ is Hermitian, then
every vector $x$ may be written in the form $x = By$,
and since

$$(Ax,x) = (ABy,By) = (B^*ABy,y),$$

the reality of the last term for all $y$ implies the
reality of the first term for all $x$.

§ 56. NON NEGATIVE TRANSFORMATIONS

When is a complex number $\zeta$ non negative? Two
equally natural necessary and sufficient conditions are
that $\zeta$ may be written in the form $\zeta = \xi^2$ with
some real $\xi$, or that $\zeta$ may be written in the form
$\zeta = \bar{\sigma}\sigma$ with any $\sigma$. Remembering also the
fact that the Hermitian character of a transformation $A$ can be
described in terms of the function $(Ax,x)$, we may
consider any one of the three conditions below and attempt
to use it as the definition of a transformation being
non negative.

(1) $A = B^2$ with some Hermitian $B$.

(2) $A = C^*C$, with some $C$.

(3) $(Ax,x) \geq 0$ for all $x$.

Before deciding which one of these three conditions to
use as definition we observe that (1) $\Rightarrow$ (2) $\Rightarrow$ (3). For
if $A = B^2$, $B = B^*$, then $A = BB = B^*B$, and if $A = C^*C$
then $(Ax,x) = (C^*Cx,x) = (Cx,Cx) = \|Cx\|^2 \geq 0$. It is
actually true that (3) implies (1), so that the three
conditions are equivalent, but we shall not be able to
prove this till later. We adopt as our definition the
third condition.
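
The implication (2) $\Rightarrow$ (3) can be observed numerically. The sketch below is ours; it assumes matrices with respect to an orthonormal basis, and the matrix `C`, the sample vectors, and the helper names are arbitrary choices made for the illustration.

```python
# Illustrative sketch: if A = C*C then (Ax, x) = ||Cx||^2 >= 0.
# The matrix C and the sample vectors are assumptions.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

C = [[1 + 1j, 2 + 0j], [0j, 3 - 1j]]
A = matmul(adjoint(C), C)
samples = [[1 + 0j, 0j], [1j, 1 + 0j], [2 - 1j, -3 + 0.5j]]

# Pairs ((Ax, x), ||Cx||^2) for each sample vector x.
results = []
for x in samples:
    Cx = apply(C, x)
    results.append((inner(apply(A, x), x), inner(Cx, Cx).real))

print(all(abs(q - n2) < 1e-9 and q.real >= 0 for q, n2 in results))
```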

DEFINITION. A linear transformation $A$
is non negative, in symbols $A \geq 0$, if for all
$x$, $(Ax,x) \geq 0$.

More generally we shall write $A \geq B$ (or $B \leq A$) if
$A - B \geq 0$. Although, of course, it is quite possible that
the difference of two non Hermitian transformations is
non negative, we shall generally use this notation for
Hermitian transformations only.

Non negative transformations are usually called
non negative semi definite; if $A \geq 0$ and $(Ax,x) = 0$
implies $x = 0$, $A$ is called positive definite. Since
the Schwarz inequality implies $|(Ax,x)| \leq \|Ax\| \cdot \|x\|$,
we see that for a positive definite operator $A$, $Ax = 0$
implies $x = 0$, (so that on a finite dimensional space a
positive definite operator has an inverse). We shall see
later that the converse is true: if $A \geq 0$ and $A$ has
an inverse then $A$ is positive definite. For positive
definite transformations $A$ we shall write $A > 0$; if
$A - B > 0$ we also write $A > B$ (or $B < A$).

We observe that it follows from the theorem of §54
that if $A$, on a complex unitary space, is non negative,
then $A$ is Hermitian.

It is possible to give a matricial characterization
of non negative transformations; we shall postpone this
discussion until later. In the meantime we shall have
occasion to refer to non negative matrices, meaning
thereby matrices $(\alpha_{ij})$ with the property that for
every set $(\xi_1, \dots, \xi_n)$ of $n$ complex numbers we have
$\sum_i \sum_j \alpha_{ij} \bar{\xi}_i \xi_j \geq 0$. This condition is
clearly equivalent to the condition that $(\alpha_{ij})$ be the matrix,
with respect to any orthogonal coordinate system, of a
non negative transformation.

The algebraic rules for non negative transformations
are similar to those for Hermitian transformations as
far as sums, scalar multiples, and inverses are concerned;
even Theorem 2 (§55) remains valid if we replace
"Hermitian" by "non negative" throughout. It is also true
that if $A$ and $B$ are non negative then $AB$ and $BA$ are
non negative if and only if they are equal, so that $A$
and $B$ commute, but we shall have to postpone the proof
of this statement till later.

§ 57. PERPENDICULAR PROJECTIONS

We are now in a position to fulfill our earlier
promise to investigate the projections associated with
the particular direct sum decompositions
$\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{M}^{\perp}$.
We shall call such a projection a perpendicular projection.
Since $\mathfrak{M}^{\perp}$ is uniquely determined by the linear
manifold $\mathfrak{M}$, we need not specify both the direct summands
associated with a projection if we already know that it is
perpendicular. We shall call the (perpendicular) projection
$E$ on $\mathfrak{M}$ along $\mathfrak{M}^{\perp}$ simply the projection
on $\mathfrak{M}$, and we shall write $E = P_{\mathfrak{M}}$.


THEOREM 1. A linear transformation $E$
is a perpendicular projection if and only if
$E = E^2 = E^*$. Perpendicular projections are
non negative linear transformations and have
the property that $\|Ex\| \leq \|x\|$ for all $x$.

PROOF. If $E$ is a perpendicular projection then
Theorem 1 (§33) and the theorem of §19 show (after, of
course, the usual replacements,
$\mathfrak{M}^0 \to \mathfrak{M}^{\perp}$, $A' \to A^*$,
etc.) that $E = E^*$. Conversely if $E = E^2 = E^*$, then the
idempotence of $E$ assures us that $E$ is the projection
on $\mathfrak{R}$ along $\mathfrak{N}$, where, of course,
$\mathfrak{R} = \mathfrak{R}(E)$ and $\mathfrak{N} = \mathfrak{N}(E)$
are the range and null space of $E$ respectively. Hence we need only
show that $\mathfrak{R}$ and $\mathfrak{N}$ are orthogonal. For this
purpose let $x$ be any element of $\mathfrak{R}$ and $y$ any element of
$\mathfrak{N}$; the desired result follows from the relation

$$(x,y) = (Ex,y) = (x,E^*y) = (x,Ey) = 0.$$

The non negative nature of an $E$ satisfying $E = E^2 = E^*$
follows from

$$(Ex,x) = (E^2x,x) = (Ex,E^*x) = (Ex,Ex) = \|Ex\|^2 \geq 0.$$

Applying this result to the perpendicular projection
$1 - E$ we see that

$$\|x\|^2 - \|Ex\|^2 = (x,x) - (Ex,x) = ([1 - E]x,x) \geq 0;$$

this concludes the proof of the theorem.
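
The characterization $E = E^2 = E^*$ is easy to observe for a concrete manifold. The sketch below is ours; it assumes that $\mathfrak{M}$ is given by an orthonormal set of coordinate vectors, in which case the matrix of the projection on $\mathfrak{M}$ has the entries $\sum_i e_i[j]\,\overline{e_i[k]}$. The helper names and the sample vectors are illustrative assumptions.

```python
# Illustrative sketch: the perpendicular projection on the span of an
# orthonormal set, checked against E = E^2 = E* and ||Ex|| <= ||x||.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def projection_matrix(orthonormal_set):
    """Matrix with entries sum_i e_i[j] * conj(e_i[k])."""
    n = len(orthonormal_set[0])
    return [[sum(e[j] * e[k].conjugate() for e in orthonormal_set)
             for k in range(n)] for j in range(n)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to tol."""
    return all(abs(a - b) < tol
               for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

basis_of_M = [[1 + 0j, 0j, 0j], [0j, 0.6 + 0j, 0.8j]]
E = projection_matrix(basis_of_M)
x = [2 + 1j, 1 - 1j, 3 + 0j]
Ex = apply(E, x)

print(close(E, matmul(E, E)), close(E, adjoint(E)))
print(inner(Ex, Ex).real <= inner(x, x).real + 1e-12)
```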





For some of the generalizations of our theory it is
useful to know that idempotence together with the last
property mentioned in Theorem 1 is also characteristic of
perpendicular projections. In other words $E = E^2$ and
$\|Ex\| \leq \|x\|$ for all $x$ imply $E = E^*$.

PROOF. We are to show that $\mathfrak{R}$ and $\mathfrak{N}$ are
orthogonal. If $x$ is in $\mathfrak{N}^{\perp}$ then $y = Ex - x$ is in
$\mathfrak{N}$ since $Ey = E^2x - Ex = Ex - Ex = 0$. Hence
$Ex = x + y$ with $(x,y) = 0$, so that
$\|x\|^2 \geq \|Ex\|^2 = \|x\|^2 + \|y\|^2 \geq \|x\|^2$,
and therefore $y = 0$. Consequently $Ex = x$ so that $x$ is in
$\mathfrak{R}$; $\mathfrak{N}^{\perp} \subset \mathfrak{R}$. Conversely
if $z$ is in $\mathfrak{R}$, $Ez = z$, we write $z = x + y$ with $x$ in
$\mathfrak{N}^{\perp}$ and $y$ in $\mathfrak{N}$. Then
$z = Ez = Ex + Ey = Ex = x$. ($Ex = x$ since $x$ is in
$\mathfrak{N}^{\perp} \subset \mathfrak{R}$). Hence $z$ is in
$\mathfrak{N}^{\perp}$, $\mathfrak{R} \subset \mathfrak{N}^{\perp}$, and
therefore $\mathfrak{R} = \mathfrak{N}^{\perp}$.

We shall need also the fact that the theorem of §30
remains true if the word "projection" is qualified
throughout by "perpendicular." This is an immediate
consequence of the preceding characterization of perpendicular
projections and of the fact that sums and differences
of Hermitian transformations are Hermitian, whereas the
product of two Hermitian transformations is Hermitian if
and only if they commute. By the methods of unitary
geometry it is also quite easy to generalize the part of
this theorem dealing with sums from two to any finite
number of summands. This generalization is most
conveniently stated in terms of the concept of orthogonality
for projections: we shall say that two (perpendicular)
projections $E$ and $F$ are orthogonal if $EF = 0$.
(Consideration of the adjoints shows that this is equivalent
to $FE = 0$). That the geometric language is justified is
shown by the following theorem.

THEOREM 2. Two perpendicular projections
$E = P_{\mathfrak{M}}$ and $F = P_{\mathfrak{N}}$ are orthogonal if and
only if the linear manifolds $\mathfrak{M}$ and $\mathfrak{N}$ (i.e.
the ranges of $E$ and $F$) are orthogonal.

PROOF. If $EF = 0$, and if $x$ and $y$ are in the
ranges of $E$ and $F$ respectively, then

$$(x,y) = (Ex,Fy) = (x,E^*Fy) = (x,EFy) = 0.$$

If conversely $\mathfrak{M}$ and $\mathfrak{N}$ are orthogonal, (so that
$\mathfrak{N} \subset \mathfrak{M}^{\perp}$) then the fact that $Ex = 0$
for $x$ in $\mathfrak{M}^{\perp}$ implies that $EFx = 0$ for all $x$
(since $Fx$ is in $\mathfrak{N}$ and consequently in
$\mathfrak{M}^{\perp}$).

§ 58. ALGEBRAIC COMBINATIONS OF PERPENDICULAR PROJECTIONS

The sum theorem for perpendicular projections is now
easy.

THEOREM 1. If $E_1, \dots, E_n$ is a finite
set of (perpendicular) projections, then
$E = E_1 + \cdots + E_n$ is a (perpendicular) projection
if and only if $E_i E_j = 0$ for every $i \neq j$,
(i.e. if and only if the $E_i$ are pairwise
orthogonal).

PROOF. The proof that pairwise orthogonality implies
that $E$ is a projection is trivial; we prove explicitly
only the converse so that we now assume that $E$ is a
perpendicular projection. Then for any $x$ belonging to
the range of $E_i$, for some fixed $i = 1, \dots, n$, we have

$$\|x\|^2 \geq \|Ex\|^2 = (Ex,x) = \Bigl(\sum_j E_j x, x\Bigr) = \sum_j (E_j x, x) = \sum_j \|E_j x\|^2 \geq \|E_i x\|^2 = \|x\|^2,$$

so that we must have equality all along. Since in
particular we must have

$$\sum_j \|E_j x\|^2 = \|E_i x\|^2,$$

we see that for $j \neq i$, $E_j x = 0$. In other words: any
$x$ in the range of $E_i$ is in the null space (and
consequently orthogonal to the range) of every $E_j$, for $j \neq i$;
using Theorem 2 (§57) we draw the desired conclusion.

The straightforward generalization of (i), Theorem 1,
(§33), (i.e. the statement obtained from Theorem 1 of
the present section by omitting the parenthetical clauses)
is also true, and is most easily proved by considering
the traces of the summands and the sum; we do not enter
into this proof here.

We conclude our discussion of projections with a
remark on order relations. It is tempting to write
$E \leq F$ for two perpendicular projections
$E = P_{\mathfrak{M}}$ and $F = P_{\mathfrak{N}}$, if
$\mathfrak{M} \subset \mathfrak{N}$. Previously, however, we
interpreted the sign $\leq$ when used in an expression, such as
$E \leq F$, involving linear transformations to mean that
$F - E$ is a non negative transformation. There are also
other possible reasons for considering $E$ to be smaller
than $F$; we might have $\|Ex\| \leq \|Fx\|$ for all $x$, or
$FE = EF = E$, (see (ii), § 30). The situation is
straightened out by the following theorem, which plays
here a role similar to Theorem 2 (§57), i.e. establishes
the coincidence of several seemingly different concepts
concerning projections, some of which are defined
operatorially while others refer to the underlying
geometrical objects.


THEOREM 2. For perpendicular projections
$E = P_{\mathfrak{M}}$ and $F = P_{\mathfrak{N}}$ the following four
conditions are equivalent.

(i) $E \leq F$.

(ii) $\|Ex\| \leq \|Fx\|$ for all $x$.

(iii) $\mathfrak{M} \subset \mathfrak{N}$.

(iva) $FE = E$.

(ivb) $EF = E$.

PROOF. We shall prove the implication relations
(i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iva) $\Rightarrow$ (ivb) $\Rightarrow$ (i).






(i) $\Rightarrow$ (ii). If $E \leq F$ then for all $x$

$$0 \leq ([F - E]x,x) = (Fx,x) - (Ex,x) = \|Fx\|^2 - \|Ex\|^2,$$

(since $E$ and $F$ are perpendicular projections).

(ii) $\Rightarrow$ (iii). We assume $\|Ex\| \leq \|Fx\|$ for all
$x$. Let us now take any $x$ in $\mathfrak{M}$; then we have

$$\|x\| \geq \|Fx\| \geq \|Ex\| = \|x\|,$$

so that $\|Fx\| = \|x\|$ or $(x,x) - (Fx,x) = 0$, whence

$$([1 - F]x,x) = \|(1 - F)x\|^2 = 0,$$

and consequently $x = Fx$. In other words $x$ in $\mathfrak{M}$
implies that $x$ is in $\mathfrak{N}$, as was to be proved.

(iii) $\Rightarrow$ (iva). If $\mathfrak{M} \subset \mathfrak{N}$, then
for all $x$, $Ex$ is in $\mathfrak{N}$, so that, for all $x$,
$FEx = Ex$, as was to be proved.

That (iva) implies, and is in fact equivalent to,
(ivb), follows by taking adjoints.

(iv) $\Rightarrow$ (i). If $EF = FE = E$ then for all $x$

$$(Fx,x) - (Ex,x) = (Fx,x) - (FEx,x) = (F[1 - E]x,x).$$

Since $E$ and $F$ are commutative projections, so also
are $(1 - E)$ and $F$, and consequently $G = F(1 - E)$ is a
projection. Hence

$$(Fx,x) - (Ex,x) = (Gx,x) = \|Gx\|^2 \geq 0.$$

This completes the proof of Theorem 2.

In terms of the concepts introduced by now it is
possible to give a quite intuitive sounding formulation
of the theorem of § 30. Thus: for two perpendicular
projections $E$ and $F$, their sum, product, or difference is
also a perpendicular projection if and only if $F$ is
respectively orthogonal to, commutative with, or greater
than $E$.


§ 59. UNITARY TRANSFORMATIONS

We continue with our program of investigating the
analogy between numbers and transformations. When does
a complex number $\zeta$ have absolute value one? Clearly a
necessary and sufficient condition is that
$\bar{\zeta} = \zeta^{-1}$; guided by our heuristic principle we are
led to consider linear transformations $U$ for which
$U^* = U^{-1}$, or, equivalently, for which $UU^* = U^*U = 1$. Such
transformations are called unitary. We observe that on a finite
dimensional vector space either of the two conditions
$UU^* = 1$ and $U^*U = 1$ implies the other, (see Theorems
1 and 2, §24). Concerning unitary transformations we
prove the following theorem.

THEOREM. The following three conditions
on a linear transformation $U$ on a finite
dimensional unitary space are equivalent to
each other.

(1) $U$ is unitary.

(2) $\|Ux\| = \|x\|$ for all $x$.

(3) $(Ux,Uy) = (x,y)$ for all $x$ and $y$.

PROOF. If $U$ is unitary then for all $x$

$$\|Ux\|^2 = (Ux,Ux) = (U^*Ux,x) = (x,x) = \|x\|^2.$$

If $\|Ux\| = \|x\|$, we may use the identity

$$(x,y) = \tfrac{1}{4}\bigl(\|x+y\|^2 - \|x-y\|^2 + i\|x+iy\|^2 - i\|x-iy\|^2\bigr). \tag{4}$$

Since the right side is invariant when we replace $x$ and
$y$ by $Ux$ and $Uy$, so is the left. (The identity (4)
plays here a role similar to that of the polarization
identity (1), § 53; it enables us to pass from properties
of $\|x\|^2$ to properties of $(x,y)$.)

If, finally, $(Ux,Uy) = (x,y)$ then

$$0 = (Ux,Uy) - (x,y) = (U^*Ux,y) - (x,y) = ([U^*U - 1]x,y),$$

so that $U^*U = 1$. The finite dimensionality now assures
us that $UU^*$ is also 1, so that $U$ is unitary. Since
we have proved the implication relations
(1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (1), the proof of the
theorem is complete; it is important to observe that
(1) $\Rightarrow$ (2) $\Rightarrow$ (3) is true even
in non finite dimensional spaces. We note also that $U^{-1}$
and $U^*$ are unitary if and only if $U$ is.
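
The equivalences of the theorem, and the identity (4) on which the proof turns, can be checked on a small example. The sketch below is ours; the matrix `U` is one arbitrary unitary matrix with respect to an orthonormal basis, and the vectors and helper names are assumptions made for the illustration.

```python
# Illustrative sketch: a unitary matrix preserves lengths and inner
# products, and identity (4) recovers (x, y) from squared lengths.

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def norm_sq(v):
    """Squared length ||v||^2 = (v, v)."""
    return inner(v, v).real

def polarization(x, y):
    """Right side of identity (4)."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return (norm_sq(plus) - norm_sq(minus)
            + 1j * norm_sq(plus_i) - 1j * norm_sq(minus_i)) / 4

U = [[0.6 + 0j, 0.8j], [0.8j, 0.6 + 0j]]  # columns are orthonormal
x = [1 + 2j, -1j]
y = [3 + 0j, 2 - 1j]
Ux, Uy = apply(U, x), apply(U, y)

print(abs(norm_sq(Ux) - norm_sq(x)) < 1e-12)          # condition (2)
print(abs(inner(Ux, Uy) - inner(x, y)) < 1e-12)       # condition (3)
print(abs(polarization(x, y) - inner(x, y)) < 1e-12)  # identity (4)
```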

In any algebraic system, and in particular in
general vector spaces and unitary spaces, it is of interest
to consider the automorphisms of the system: i.e. to
consider those one to one mappings of the system on
itself which preserve all relations among its elements.
We have seen already that the automorphisms of a general
vector space are the non singular linear transformations.
In a unitary space we require more of an automorphism,
namely that it also preserve inner products (and
consequently lengths). The preceding theorem shows that this
requirement is equivalent to the condition that the
transformation be unitary. Thus the two questions:
"What linear transformations are the analogs of complex
numbers of absolute value one?" and "What are the most
general automorphisms of a unitary space?" have the same
answer: unitary transformations. In the following
section we shall show that unitary transformations furnish
also the answer to a third important question.

§ 60. CHANGE OF ORTHOGONAL BASIS

We have seen that the theory of the passage from
one linear basis of a vector space to another is best
studied by means of an associated linear transformation
$A$ (§§ 34, 35); the question arises as to what special
properties $A$ has when we pass from one orthogonal basis
of a unitary space to another. The answer is easy:

THEOREM 1. If $\mathfrak{X} = (x_1, \ldots, x_n)$ is an
orthogonal basis of the n-dimensional unitary
space $\mathfrak{V}$, and if $U$ is any unitary transformation
on $\mathfrak{V}$, then $U\mathfrak{X} = (Ux_1, \ldots, Ux_n)$
is also an orthogonal basis of $\mathfrak{V}$. Conversely
if $U$ is a linear transformation and $\mathfrak{X}$ an
orthogonal basis with the property that $U\mathfrak{X}$
is also an orthogonal basis then $U$ is unitary.




PROOF. Since $(Ux_i,Ux_j) = (x_i,x_j) = \delta_{ij}$, $U\mathfrak{X}$ is
an orthonormal set along with $\mathfrak{X}$; it is complete if $\mathfrak{X}$
is, since $(x,Ux_i) = 0$, for $i = 1, \ldots, n$, implies
$(U^*x,x_i) = 0$, whence $U^*x = x = 0$. If, conversely, $U\mathfrak{X}$
is a complete orthonormal set along with $\mathfrak{X}$, then we
have $(Ux,Uy) = (x,y)$ for all $x$ and $y$ in $\mathfrak{X}$, and
it is clear that by linearity we obtain $(Ux,Uy) = (x,y)$
for all $x, y$.

We observe that the matrix $(u_{ij})$ of a unitary
transformation $U$, with respect to an arbitrary orthogonal
basis, satisfies the conditions

\[ \sum_k \bar{u}_{ki} u_{kj} = \delta_{ij}, \]

and, conversely, any such matrix together with any orthogonal
basis, defines a unitary transformation.
(Proof?) As another exercise in the use of unitary
transformations the reader might prove that a linear
transformation which satisfies any two of the three conditions
of being involutory, Hermitian, or unitary, also
satisfies the third, and that consequently the involutions
associated ((A), § 31) with perpendicular projections
are also unitary.

An interesting and easy consequence of our considerations
concerning unitary transformations is the following
corollary of Theorem 1, § 41.

THEOREM 2. Given any linear transformation
$A$ on an n-dimensional unitary space
$\mathfrak{V}$, there exists an orthogonal basis $\mathfrak{X}$ in
$\mathfrak{V}$ such that the matrix $[A; \mathfrak{X}]$ has the
superdiagonal form; or, equivalently, given
any matrix $[A]$, there exists a unitary matrix
$[U]$ such that $[U]^{-1}[A][U]$ is superdiagonal.

PROOF. In the derivation of Theorem 2 from Theorem
1 (in § 41) we constructed a (linear) basis
$\mathfrak{X} = (x_1, \ldots, x_n)$ with the property that $x_1, \ldots, x_j$
lie in $\mathfrak{M}_j$ and span $\mathfrak{M}_j$ for $j = 1, \ldots, n$, and
showed that in this basis the matrix of $A$ is superdiagonal.
If we knew that this basis is also an orthogonal
basis we could apply Theorem 1 of the present section to
obtain the desired result. But it is easy to make $\mathfrak{X}$
into an orthogonal basis even if it isn't one already:
this is precisely what the Gram-Schmidt orthogonalization
process (§ 48) can do. Here we use a special property of
the Gram-Schmidt process, namely that the j-th element
of the orthogonal basis it constructs is a linear combination
of $x_1, \ldots, x_j$ and lies therefore in $\mathfrak{M}_j$.

We observe that in most of the preceding sections of
this chapter we have treated complex unitary spaces. All
of our theorems, past and future, have interesting and
important analogs in the real case. In real vector
spaces unitary transformations are called orthogonal and
Hermitian ones symmetric. The main difference between
the two disciplines is caused by the algebraic closure
of the complex field. The closest counterpart of algebraic
closure in the real case is the theorem that every
polynomial can be factored into at most second degree
pieces. Although we shall not treat the problems arising
from this difference, we hope that by the time he has
reached the end of this book the interested reader will
be in a position where he will be able to apply the
methods and results of our work to this more delicate
study.


§ 61. CAYLEY TRANSFORM

The classes of complex numbers, whose transformation
analogs we have been studying, have certain algebraic
relations to each other. One such relation is given by the
complex valued function $\sigma = \dfrac{\tau - i}{\tau + i}$ of the real variable
$\tau$. This function maps the entire real line, $-\infty < \tau < \infty$,
in a one to one way, on the unit circle minus
the point $\sigma = 1$. (The geometric picture is quite easy:
$\tau = 0$ becomes $\sigma = -1$, and the line is just wrapped
around once to cover the circle.) The inverse mapping
is given by $\tau = \dfrac{1}{i}\,\dfrac{\sigma + 1}{\sigma - 1}$. We shall show that the
best possible analog of this result is valid in transformation
theory.

THEOREM. If $A$ is any Hermitian transformation
on a finite dimensional unitary
space $\mathfrak{V}$, then both the transformations
$A \pm i1$ ($i = \sqrt{-1}$) have inverses; the
transformation $U$ defined by

\[ U = (A + i1)^{-1}(A - i1) = (A - i1)(A + i1)^{-1} \tag{1} \]

(called the Cayley transform of $A$) is unitary
and fails to have the number 1 for a
proper value. Consequently $U - 1$ has an inverse
and

\[ A = \tfrac{1}{i}(U - 1)^{-1}(U + 1) = \tfrac{1}{i}(U + 1)(U - 1)^{-1}. \tag{2} \]

Conversely, if $U$ is any unitary transformation
which fails to have the number 1 for a
proper value then (2) defines a Hermitian transformation
$A$ and the Cayley transform of $A$
is $U$.

PROOF. For any transformation $A$ we have

\[ ((A \pm i1)x, (A \pm i1)y) = (Ax,Ay) \pm (Ax,iy) \pm (ix,Ay) + (x,y). \]

Hence if $A$ is Hermitian we obtain, by taking $x = y$,

\[ \|(A + i1)x\|^2 = \|(A - i1)x\|^2 = \|Ax\|^2 + \|x\|^2. \]

In other words $(A \pm i1)x = 0$ implies $x = 0$, so that
both $A \pm i1$ have inverses and the definition (1) makes
sense.

For any vector $x$ we write $y = (A + i1)^{-1}x$, so
that $x = (A + i1)y$; then

\[ \|Ux\| = \|(A - i1)y\| = \|(A + i1)y\| = \|x\|; \]

$U$ is unitary. Since, moreover, $Ux = x$ implies
$(A - i1)y = (A + i1)y$ so that $y = x = 0$, 1 is not a
proper value of $U$ and, therefore, $U - 1$ does have an
inverse. Finally we have

\[ x - Ux = (A + i1)y - (A - i1)y = 2iy, \]

\[ x + Ux = (A + i1)y + (A - i1)y = 2Ay, \]

so that $A(x - Ux) = 2iAy = i(x + Ux)$; this establishes
the validity of (2).

Let us now go backwards. Starting with $U$ we define
a transformation $A$ by (2). For any pair of vectors
$x$ and $y$ we write $x' = (U - 1)^{-1}x$, $y' = (U - 1)^{-1}y$,
so that $x = (U - 1)x'$, $y = (U - 1)y'$. Then $Ax = \tfrac{1}{i}(U + 1)x'$,
so that

\[ (Ax,y) = \tfrac{1}{i}(Ux' + x', Uy' - y') \]
\[ = \tfrac{1}{i}\{(Ux',Uy') + (x',Uy') - (Ux',y') - (x',y')\} \]
\[ = \tfrac{1}{i}\{(x',Uy') - (Ux',y')\} = i(Ux',y') + \overline{i(Uy',x')}. \]

Since interchanging $x$ and $y$ replaces the last term
of this relation by its own complex conjugate, it has
the same effect on the first term, and consequently

\[ (Ax,y) = \overline{(Ay,x)} = (x,Ay), \]

so that $A$ is Hermitian.

If for any $x$ we write $y = (U - 1)^{-1}x$, so that
$x = (U - 1)y$, then $Ax = \tfrac{1}{i}(U + 1)y$ and therefore

\[ (A + i1)x = \tfrac{1}{i}\{(U + 1)y - (U - 1)y\} = \tfrac{1}{i}\,2y, \]

\[ (A - i1)x = \tfrac{1}{i}\{(U + 1)y + (U - 1)y\} = \tfrac{1}{i}\,2Uy. \]

Consequently

\[ U(A + i1)x = \tfrac{1}{i}\,2Uy = (A - i1)x; \]

this establishes the validity of (1) and concludes the
proof of the theorem.

It is worth remarking that even this theorem, as intimately
as it may appear to be tied up with $i = \sqrt{-1}$,
has a very natural analog in the real case. For if we
write $B = iA$ then $U = (B - 1)^{-1}(B + 1)$ and, clearly,
$B^* = -B$. Linear transformations with this latter property
are called skew-Hermitian (or, in the real case,
skew-symmetric). The natural and valid real analog of
the theorem of this section establishes a correspondence,
similar in every detail to the one we described, between
orthogonal and skew-symmetric transformations.
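The two formulas (1) and (2) can be tried on a small case. The following is a minimal sketch under stated assumptions: the Hermitian matrix `A` is a hypothetical example, and the helpers `inverse2` and `shift` are illustrative names that only handle 2 by 2 matrices.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    # Conjugate transpose.
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def inverse2(m):
    # Inverse of a 2x2 matrix with non-zero determinant.
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def shift(m, c):
    # Returns m + c*1 for a 2x2 matrix m and a scalar c.
    return [[m[i][j] + (c if i == j else 0) for j in range(2)]
            for i in range(2)]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]

# Hypothetical Hermitian matrix (equal to its conjugate transpose).
A = [[2, 1 - 1j], [1 + 1j, -1]]

# Formula (1): U = (A + i1)^{-1} (A - i1).
U = matmul(inverse2(shift(A, 1j)), shift(A, -1j))
is_unitary = close(matmul(adjoint(U), U), IDENTITY)

# Formula (2): A = (1/i) (U - 1)^{-1} (U + 1).
A_back = [[v / 1j for v in row]
          for row in matmul(inverse2(shift(U, -1)), shift(U, 1))]
recovered = close(A_back, A)
```

Both `is_unitary` and `recovered` come out true for this example; the call `inverse2(shift(U, -1))` succeeds precisely because 1 is not a proper value of `U`.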

§ 62. PROPER VALUES OF HERMITIAN
AND UNITARY TRANSFORMATIONS

The analogy between numbers and transformations is
supported even more than before by the following results
which assert that the properties which caused us to define
the special classes of transformations that we have
been considering are reflected by their spectra.

THEOREM 1. If $A$ is Hermitian then
every proper value of $A$ is real, and moreover,
$A$ is non negative or positive definite
if and only if all its proper values are non
negative or positive respectively.

PROOF. If $Ax = \lambda x$, with $x \neq 0$, then since $A$
is Hermitian $(Ax,x)$ is real and consequently

\[ \frac{(Ax,x)}{\|x\|^2} = \frac{\lambda \|x\|^2}{\|x\|^2} = \lambda \]

is also real. The same proof establishes the statement
concerning non negative transformations; the result for
positive definite transformations follows from the fact
that such a transformation must have an inverse and cannot
therefore have the proper value zero.

THEOREM 2. Every proper value of a unitary
transformation has absolute value one.

PROOF. If $Ux = \lambda x$, $x \neq 0$, then
$\|x\| = \|Ux\| = |\lambda| \cdot \|x\|$.





THEOREM 3. If $A$ is either Hermitian or
unitary then proper vectors belonging to different
proper values are orthogonal.

PROOF. Suppose $Ax_1 = \lambda_1 x_1$, $Ax_2 = \lambda_2 x_2$, $\lambda_1 \neq \lambda_2$.
Then if $A$ is Hermitian we have

\[ \lambda_1 (x_1,x_2) = (Ax_1,x_2) = (x_1,Ax_2) = \lambda_2 (x_1,x_2). \tag{1} \]

(The middle step makes use of the Hermitian character of
$A$ and the last step of the reality of $\lambda_2$.) In case $A$
is unitary (1) is replaced by

\[ (x_1,x_2) = (Ax_1,Ax_2) = (\lambda_1/\lambda_2)(x_1,x_2), \tag{2} \]

(using the fact that $\bar{\lambda}_2 = 1/\lambda_2$). In either case
$(x_1,x_2) \neq 0$ implies $\lambda_1 = \lambda_2$, so that we must have
$(x_1,x_2) = 0$.
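As an illustration of Theorems 2 and 3 (an example added here, not taken from the text), consider the matrix of a quarter turn of the plane with respect to an orthogonal basis. The inner product is written in coordinates as $(x,y) = \sum_k \xi_k \bar{\eta}_k$.

```latex
% Illustrative example: a unitary matrix, its proper values, and the
% orthogonality of the corresponding proper vectors.
\[
U = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad U^*U = 1 .
\]
% The proper values are the roots of \lambda^2 + 1 = 0, namely i and -i,
% both of absolute value one:
\[
U \begin{pmatrix} 1 \\ -i \end{pmatrix} = i \begin{pmatrix} 1 \\ -i \end{pmatrix},
\qquad
U \begin{pmatrix} 1 \\ i \end{pmatrix} = -i \begin{pmatrix} 1 \\ i \end{pmatrix}.
\]
% The two proper vectors are orthogonal:
\[
\left( \begin{pmatrix} 1 \\ -i \end{pmatrix},
       \begin{pmatrix} 1 \\ i \end{pmatrix} \right)
= 1 \cdot \bar{1} + (-i) \cdot \bar{i} = 1 + (-i)(-i) = 1 - 1 = 0 .
\]
```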

THEOREM 4. If a linear manifold $\mathfrak{M}$
reduces the unitary transformation $U$, on a
finite dimensional unitary space, then so
does $\mathfrak{M}^\perp$.

PROOF. Considered on the finite dimensional linear
manifold $\mathfrak{M}$, $U$ is a linear transformation with the
property $(Ux,Uy) = (x,y)$; hence it is a unitary transformation
on $\mathfrak{M}$ and as such has an inverse. Consequently
every $x$ in $\mathfrak{M}$ may be written in the form
$x = Uy$ with $y$ in $\mathfrak{M}$; in other words $x$ in $\mathfrak{M}$ implies
that $y = U^{-1}x$ is in $\mathfrak{M}$. Hence $\mathfrak{M}$ reduces $U^{-1} = U^*$.
It follows from Theorem 2, § 33, that $\mathfrak{M}^\perp$ reduces
$(U^*)^* = U$.

We observe that the same result for Hermitian transformations
(even in not necessarily finite dimensional
spaces) is trivial, since if $\mathfrak{M}$ reduces $A$ then $\mathfrak{M}^\perp$ reduces
$A^* = A$.


THEOREM 5. If $A$ is either a Hermitian
or a unitary transformation on an n-dimensional
unitary space $\mathfrak{V}$, then the algebraic
multiplicity of any proper value $\lambda_0$ of $A$
is equal to its geometric multiplicity, i.e.
to the dimension, say $m$, of the linear manifold
$\mathfrak{M}$ of all solutions of $Ax = \lambda_0 x$.

PROOF. We shall use only the property described in
Theorem 4 so that we may simultaneously establish the result
for both the Hermitian and the unitary case.

It is clear that $\mathfrak{M}$, and therefore $\mathfrak{M}^\perp$, reduces $A$;
let us denote by $A_1$ and $A_2$ the linear transformation
$A$ considered only on $\mathfrak{M}$ and $\mathfrak{M}^\perp$ respectively. By
choosing a basis $(x_1, \ldots, x_n)$ in $\mathfrak{V}$ so that $x_1, \ldots, x_m$
are in $\mathfrak{M}$ and $x_{m+1}, \ldots, x_n$ are in $\mathfrak{M}^\perp$, we see
that for all $\lambda$

\[ \Delta(A - \lambda 1) = \Delta(A_1 - \lambda 1) \cdot \Delta(A_2 - \lambda 1). \]

Since $A_1$ is a linear transformation on an m-dimensional
space with only one proper value $\lambda_0$, $\lambda_0$ must occur as
a proper value of $A_1$ with the algebraic multiplicity
$m$, so that $\Delta(A_1 - \lambda 1) = (\lambda_0 - \lambda)^m$. Since on the
other hand $\lambda_0$ is not a proper value of $A_2$, so that
$\Delta(A_2 - \lambda_0 1) \neq 0$, we see that $\Delta(A - \lambda 1)$ contains
$(\lambda_0 - \lambda)$ as a factor exactly $m$ times, as was to be
proved.

§ 63. SPECTRAL THEOREM FOR HERMITIAN TRANSFORMATIONS

We are now ready to prove the main theorem of this
book, the theorem of which most of the other results of
this chapter are immediate corollaries. To a large extent
what we have been doing up to the present was a
matter of sport (useful, however, for generalizations):
we wanted to show how much can conveniently be done with
spectral theory before proving the spectral theorem.
The spectral theorem, incidentally, can be made to follow
trivially from the superdiagonalization process we have
already described: because of the importance of the
theorem we prefer to give below its (quite easy) direct
proof. The reader may find it profitable to adapt the
method of proof (not the result) of Theorem 2, § 41, to
prove as much as he can of the spectral theorem.


THEOREM. To any Hermitian linear transformation
$A$ on an n-dimensional unitary
space there corresponds an integer $p$, $1 \leq p \leq n$,
$p$ perpendicular projections $E_1, \ldots, E_p$
(different from zero) and $p$ numbers
$\alpha_1, \ldots, \alpha_p$ with the following properties:

(1) the $E_j$ are pairwise orthogonal,

(2) the $\alpha_j$ are pairwise different,

(3) $\sum_j E_j = 1$,

(4) $AE_j = E_jA$ for $j = 1, \ldots, p$,

(5) $\sum_j \alpha_j E_j = A$.

The $\alpha$'s and $E$'s are uniquely determined by
the conditions (1) - (5). The representation
(5) is the spectral form of $A$; some of its
further important properties are:

(6) the $\alpha_j$ are exactly the distinct
proper values of $A$ and are consequently real;

(7) the dimension of the range of $E_j$
is the multiplicity of $\alpha_j$;

(8) a linear transformation $B$ commutes
with $A$ if and only if it commutes with
each $E_j$.


PROOF. Let $\alpha_1, \ldots, \alpha_p$ be the different proper
values of $A$, and let $E_j$, $j = 1, \ldots, p$, be the perpendicular
projection on the linear manifold $\mathfrak{M}_j$ of all solutions
of $Ax = \alpha_j x$. Thus (2) and the first part of (6)
are satisfied by definition; the second part of (6) follows
from Theorem 1, § 62, and (7) from Theorem 5, § 62.
From Theorem 3, § 62, we obtain (1) and also (3). (For
(1) guarantees that $E = \sum_j E_j$ is a perpendicular projection;
if it were not $= 1$ then $A$ considered on the
range of $1 - E$ would be a linear transformation with no
proper values). The truth of (4) follows from the fact
that each $\mathfrak{M}_j$ reduces $A$; it remains to prove (5), (8),
and uniqueness.

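For any vector $x$ we write $x_j = E_j x$; then $x_j$ is
in $\mathfrak{M}_j$ (so that $E_j x_j = x_j$) and consequently $Ax_j = \alpha_j x_j$,
$j = 1, \ldots, p$. It follows that

\[ Ax = A\Big(\sum_j E_j x\Big) = \sum_j Ax_j = \sum_j \alpha_j x_j = \sum_j \alpha_j E_j x; \]

this is precisely the statement of (5).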

One half of (8) is clear from (5): if $B$ commutes
with each $E_j$ it also commutes with $A$. Suppose on the
other hand that $B$ commutes with $A$, and consider any
fixed $E_{j_0}$. $B$ surely commutes with all polynomials in
$A$, so that we will achieve our purpose if we can show
that $E_{j_0}$ is a polynomial in $A$. To do this, let us
use (5) to see what a polynomial in $A$ will look like;
we have

\[ A^2 = \Big(\sum_i \alpha_i E_i\Big)\Big(\sum_j \alpha_j E_j\Big) = \sum_j \alpha_j^2 E_j \]

(since $E_i E_j = 0$ if $i \neq j$ and $E_j^2 = E_j$); similarly
$A^n = \sum_j \alpha_j^n E_j$ for every positive integer $n$, and hence
for any polynomial $p(\tau)$,

\[ p(A) = \sum_j p(\alpha_j) E_j. \]

Now we are done: all we need to do is to find a polynomial
$p(\tau)$ which is such that $p(\alpha_j) = 0$ for all
$j \neq j_0$, and $p(\alpha_{j_0}) = 1$; for such a $p(\tau)$, $p(A) = E_{j_0}$.
We may for example choose

\[ p(\tau) = \prod_{j \neq j_0} \frac{\tau - \alpha_j}{\alpha_{j_0} - \alpha_j}. \]

To prove finally that the representation (5) is
unique, we shall assume (1) - (5) and show first that
the $\alpha_j$ are necessarily what we defined them to be in
the existence proof, namely the distinct proper values
of $A$. If $x$ is any vector in the range of any $E_j$, so
that $E_j x = x$ and $E_i x = 0$ for $i \neq j$, then

\[ Ax = \sum_i \alpha_i E_i x = \alpha_j x, \]

so that each $\alpha_j$ is a proper value of $A$. If conversely
$\lambda$ is any proper value of $A$, say $Ax = \lambda x$, $x \neq 0$,
then we write $x_j = E_j x$ and we see that

\[ Ax = \lambda x = \lambda \sum_j x_j \]

and

\[ Ax = A \sum_j x_j = \sum_j \alpha_j x_j, \]

so that $\sum_j (\lambda - \alpha_j) x_j = 0$. Since the $x_j$ are pairwise
orthogonal, those among them that are not zero form a
linearly independent set. It follows that for each $j$
either $x_j = 0$ or else $\lambda = \alpha_j$. Since $x \neq 0$, we have
$x_j \neq 0$ for some $j$, and consequently $\lambda$ is indeed one
of the $\alpha$'s. The rest of the uniqueness proof follows
from the proof of (8): there we showed, using only (1)
and (5), that each $E_j$ is a polynomial in $A$, and that
this polynomial is determined by the $\alpha$'s. This completes
our proof of the spectral theorem.
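The product formula for $p(\tau)$ in the proof of (8) gives a direct recipe for the projections once the distinct proper values are known. The following is a minimal sketch, not part of the text: the matrix `A` and its proper values `ALPHAS` are a hypothetical example, and the proper values are assumed to be supplied rather than computed.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def spectral_projections(a, alphas):
    # For each distinct proper value alphas[j0], form the product over
    # j != j0 of (A - alphas[j] 1) / (alphas[j0] - alphas[j]).
    n = len(a)
    result = []
    for j0, a0 in enumerate(alphas):
        e = identity(n)
        for j, aj in enumerate(alphas):
            if j == j0:
                continue
            factor = [[(a[r][c] - (aj if r == c else 0.0)) / (a0 - aj)
                       for c in range(n)] for r in range(n)]
            e = matmul(e, factor)
        result.append(e)
    return result

# Hypothetical Hermitian matrix with distinct proper values 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
ALPHAS = [1.0, 3.0]

E1, E2 = spectral_projections(A, ALPHAS)
SUM = [[E1[i][j] + E2[i][j] for j in range(2)] for i in range(2)]
SPECTRAL_FORM = [[ALPHAS[0] * E1[i][j] + ALPHAS[1] * E2[i][j]
                  for j in range(2)] for i in range(2)]
```

For this example `SUM` is the identity, `matmul(E1, E2)` is zero, and `SPECTRAL_FORM` reproduces `A`, in agreement with (3), (1), and (5).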

Before exploiting this theorem we remark on its
matricial interpretation. If we choose an orthogonal
basis in the range of each $E_j$, then the totality of the
vectors in these little bases forms a basis for the
whole space: expressed in this basis the matrix $(\alpha_{ij})$
of $A$ will be diagonal, i.e. $\alpha_{ij} = 0$ for $i \neq j$. The fact
that by suitable choice of an orthogonal basis the matrix
of a Hermitian transformation can be made diagonal, or,
equivalently, that any Hermitian matrix $[A]$ can be
unitarily transformed (i.e. replaced by $[U]^{-1}[A][U]$)
into a diagonal matrix, already follows of course from
the superdiagonal form. We gave our operatorial version
for two reasons. First, it is this version which generalizes
easily to the infinite dimensional case and,
second, because we believe that even in the finite dimensional
case, writing $\sum_j \alpha_j E_j$ has great notational
and typographical advantages over the matrix notation.

We shall also make use of the fact that a not necessarily
Hermitian linear transformation $A$ is unitarily
diagonable (i.e. that its matrix with respect to a suitable
orthogonal basis is diagonal) if and only if conditions
(1) - (5) of the theorem of the present section
hold for it. For if we have (1) - (5) then the proof of
diagonability, given for Hermitian transformations, applies;
the converse we leave as an exercise for the
reader.


§64. NORMAL TRANSFORMATIONS 

We have seen that every Hermitian transformation is
diagonable and that an arbitrary transformation $A$ may
be written as $A = B + iC$ with $B$ and $C$ Hermitian;
why isn't it true that simply by diagonalizing $B$ and
$C$ separately we can diagonalize $A$? The answer is, of
course, that diagonalization implies the choice of a
suitable orthogonal basis and there is no reason to expect
that a basis which diagonalizes $B$ will have the
same effect on $C$. It is of considerable importance to
know the precise class of transformations for which the
theorem of the preceding section is valid, and fortunately
this class is easy to describe.

We shall call a linear transformation $A$ normal if
it commutes with its adjoint, $AA^* = A^*A$. We point out
first that $A$ is normal if and only if its real and
imaginary parts commute. For suppose that $A$ is normal
and $A = B + iC$ with $B$ and $C$ Hermitian; since
$B = (1/2)(A + A^*)$ and $C = (1/2i)(A - A^*)$ it is clear that
$BC = CB$. If conversely $BC = CB$ then the relations
$A = B + iC$, $A^* = B - iC$ imply that $A$ is normal. We
observe that Hermitian and unitary transformations are
normal.
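The equivalence between normality and the commutativity of the real and imaginary parts can be observed on small matrices. The following is a minimal sketch, not part of the text; the two test matrices are hypothetical examples, and `cartesian_parts` is an illustrative helper name.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(p, q, tol=1e-12):
    n = len(p)
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(n) for j in range(n))

def cartesian_parts(a):
    # B = (1/2)(A + A*), C = (1/2i)(A - A*); both are Hermitian and
    # A = B + iC.
    star = adjoint(a)
    n = len(a)
    b = [[(a[i][j] + star[i][j]) / 2 for j in range(n)] for i in range(n)]
    c = [[(a[i][j] - star[i][j]) / 2j for j in range(n)] for i in range(n)]
    return b, c

def is_normal(a):
    return close(matmul(a, adjoint(a)), matmul(adjoint(a), a))

def parts_commute(a):
    b, c = cartesian_parts(a)
    return close(matmul(b, c), matmul(c, b))

# Hypothetical examples: the first matrix is normal, the second is not.
NORMAL_EXAMPLE = [[1, 1], [-1, 1]]
SHEAR = [[1, 1], [0, 1]]
```

For each of the two examples `is_normal` and `parts_commute` return the same answer: true for `NORMAL_EXAMPLE`, false for `SHEAR`.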

The class of transformations satisfying (1) - (5)
of § 63 is precisely the class of normal transformations.
A half of this statement is easy to prove: if
$A = \sum_j \alpha_j E_j$ then $A^* = \sum_j \bar{\alpha}_j E_j$, and it takes merely a
simple computation to show that $AA^* = A^*A = \sum_j |\alpha_j|^2 E_j$.
To prove the converse, i.e. that normality implies the
existence of a spectral form, we have two alternatives.
We could derive this result from the spectral theorem
for Hermitian transformations, using the real and imaginary
parts of $A$, or we could prove that the essential
lemmas of § 62, on which the proof of the Hermitian
case rests, are just as valid for an arbitrary normal
operator. Because its methods are of some interest we
adopt the second procedure. We observe that the machinery
to prove the lemmas that follow was available to us
in § 62, so that we could have stated the spectral theorem
for normal operators immediately; we travelled the
present course in order to motivate the definition of
normality.


THEOREM 1. If $A$ is normal, then $x$
is a proper vector of $A$ if and only if it
is a proper vector of $A^*$; if $Ax = \lambda x$ then
$A^*x = \bar{\lambda} x$.

PROOF. We observe that the normality of $A$ implies

\[ \|Ax\|^2 = (Ax,Ax) = (A^*Ax,x) = (AA^*x,x) = (A^*x,A^*x) = \|A^*x\|^2. \tag{1} \]

Since $A - \lambda 1$ is normal along with $A$, and since
$(A - \lambda 1)^* = A^* - \bar{\lambda} 1$, we obtain the relation

\[ \|Ax - \lambda x\| = \|A^*x - \bar{\lambda} x\|, \tag{2} \]

from which the assertions of the theorem follow immediately.


THEOREM 2. If $A$ is normal then
proper vectors belonging to different proper
values are orthogonal.

PROOF. If $Ax_1 = \lambda_1 x_1$, $Ax_2 = \lambda_2 x_2$, then

\[ \lambda_1 (x_1,x_2) = (Ax_1,x_2) = (x_1,A^*x_2) = \lambda_2 (x_1,x_2). \]

This theorem generalizes Theorem 3 of § 62; in the
proof of the spectral theorem we needed also Theorems 4
and 5 of § 62. The following result takes the place of
the first of these.

THEOREM 3. If $A$ is normal, $\lambda$ is a
proper value of $A$, and $\mathfrak{M}$ is the set of all
solutions of $Ax = \lambda x$, then both $\mathfrak{M}$ and $\mathfrak{M}^\perp$
reduce $A$.

PROOF. That $\mathfrak{M}$ reduces $A$ we have seen before.
To prove that $\mathfrak{M}^\perp$ also reduces $A$ it is sufficient to
prove that $\mathfrak{M}$ reduces $A^*$. This is easy: if $x$ is in
$\mathfrak{M}$ then

\[ A(A^*x) = A^*(Ax) = \lambda (A^*x), \]

so that $A^*x$ is also in $\mathfrak{M}$.

This theorem seems to be much weaker than its
correspondent in § 62. It is true that in a finite dimensional
unitary space $A\mathfrak{M} \subset \mathfrak{M}$ implies $A\mathfrak{M}^\perp \subset \mathfrak{M}^\perp$
for a normal $A$, and it is even true that this property
is characteristic of normality. (We shall not prove
this: the reader may verify that it is a consequence of
the spectral theorem for normal operators, which we
shall prove). The most important thing to observe, however,
is that the proof of Theorem 5 of § 62 depended
only on this weak property: the only manifolds that need
be considered are the ones of the type mentioned in the
preceding theorem.

This concludes the spade work: the spectral theorem
for normal operators follows just as before in the Hermitian
case. If in the statement of the theorem of § 63
we replace the word "Hermitian" by "normal" and delete
the reference (in (6)) to the reality of the proper
values, the rest of the statement and all of the proof
remain unchanged.

It is the theory of normal operators that is of
chief interest in the study of unitary spaces. Concerning
normal operators it is useful to observe that spectral
conditions of the type given in Theorems 1 and 2 of
§ 62, there shown to be necessary for the Hermitian, unitary,
etc. character of a transformation, are for normal
operators also sufficient. Thus:

THEOREM 4. A normal transformation $A$
with spectral form $A = \sum_j \alpha_j E_j$ is (1) Hermitian,
(2) non negative, (3) positive definite,
(4) unitary, (5) non singular, (6)
idempotent if and only if all the $\alpha_j$ are
(1') real, (2') non negative, (3') positive,
(4') of absolute value one, (5') not zero,
(6') zero or one.

PROOF. Since we know that the $\alpha_j$ are the proper
values of $A$ we know also that (j) implies (j') for
$j = 1, \ldots, 6$. Since $A^* = \sum_j \bar{\alpha}_j E_j$, (1')
implies (1). If $\alpha_j \geq 0$ then for any $x$ we have

\[ (Ax,x) = \sum_j \alpha_j (E_j x, x) = \sum_j \alpha_j \|E_j x\|^2 \geq 0, \]

so that (2') implies (2) and also (3') implies (3). (Proof:
$\sum_j \alpha_j \|E_j x\|^2 = 0$ implies that for each $j$ either
$\alpha_j = 0$ or $E_j x = 0$, and since $\alpha_j > 0$, $E_j x = 0$ for
$j = 1, \ldots, p$, so that $x = \sum_j E_j x = 0$). To prove that
(4') implies (4) we observe that (4') implies

\[ AA^* = A^*A = \sum_j |\alpha_j|^2 E_j = \sum_j E_j = 1. \]

If $\alpha_j \neq 0$ for $j = 1, \ldots, p$, we may form the linear
transformation $B = \sum_j \frac{1}{\alpha_j} E_j$; it is clear that
$AB = BA = 1$, so that $A$ is non singular. Finally
$A^2 = \sum_j \alpha_j^2 E_j$ so that if $\alpha_j^2 = \alpha_j$ for $j = 1, \ldots, p$,
then $A^2 = A$.

We observe that the implication relations
(5) $\Rightarrow$ (5'), (2) $\Rightarrow$ (2'), and (3') $\Rightarrow$ (3) together
prove an assertion we made in § 61: if $A$ is non negative
and non singular then it is positive definite.

§65. FUNCTIONS OF NORMAL TRANSFORMATIONS 

One of the most useful concepts for normal operators
is that of a function of an operator. If $A$ is a normal
linear transformation with spectral form $A = \sum_j \alpha_j E_j$
and if $f(\zeta)$ is an arbitrary complex valued
function of the complex variable $\zeta$, defined at least
for $\zeta = \alpha_j$, $j = 1, \ldots, p$, then we define a linear
transformation $f(A)$ by

\[ f(A) = \sum_j f(\alpha_j) E_j. \]

Since for polynomials (and even for rational functions)
$p(\zeta)$ we have already seen that our earlier definition
of $p(A)$ yields, for a normal $A$, $p(A) = \sum_j p(\alpha_j) E_j$,
we see that the new notion is a generalization of the old
one. The advantage of considering $f(A)$ for arbitrary
functions $f$ is for us largely notational: it introduces
nothing conceptually new. For we may write for an
arbitrary $f$, $f(\alpha_j) = \beta_j$, and then we may find a
polynomial $p(\zeta)$ which at the finite set of distinct
complex numbers $\alpha_j$ takes, respectively, the values $\beta_j$.
With this polynomial $p(\zeta)$ we have $f(A) = p(A)$, so
that the class of transformations defined by $f(A)$ is
nothing essentially new: it only saves the trouble of
constructing a polynomial $p(\zeta)$ to fit each special
case. Thus for example if for every complex number $\lambda$
we define $f_\lambda(\zeta) = 1$ if $\zeta = \lambda$, $f_\lambda(\zeta) = 0$ otherwise,
then $f_\lambda(A)$ is the perpendicular projection on the
linear manifold of solutions of $Ax = \lambda x$.

We observe that if $f(\zeta) = 1/\zeta$ then (assuming of
course that $f(\alpha_j)$ is defined for all $j$, i.e. that
$\alpha_j \neq 0$) $f(A) = A^{-1}$, and if $f(\zeta) = \bar{\zeta}$ then $f(A) = A^*$.
These statements imply that if $f(\zeta)$ is an arbitrary
rational function of $\zeta$ and $\bar{\zeta}$, we obtain $f(A)$ by the
replacements $\zeta \to A$, $\bar{\zeta} \to A^*$, $1/\zeta \to A^{-1}$, $\alpha \to \alpha 1$.
The symbol $f(A)$ is, however, defined for much more
general functions and we shall in what follows make free
use of expressions such as $e^A$ and $\sqrt{A}$.

As an exercise in the use of the functional calculus
the reader may wish to prove the theorem of § 61
(concerning the Cayley transform of a Hermitian transformation)
by considering the function $f(\zeta) = \dfrac{\zeta - i}{\zeta + i}$.
Consideration of the function $f(\zeta) = e^{i\zeta}$ shows similarly
that for every Hermitian $A$, $U = e^{iA}$ is unitary,
and that conversely every unitary $U$ has the form
$U = e^{iA}$ with a Hermitian $A$.
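The definition $f(A) = \sum_j f(\alpha_j) E_j$ translates directly into a computation once a spectral form is at hand. The following is a minimal sketch, not part of the text: the proper values `ALPHAS` and projections `PROJECTIONS` belong to the hypothetical matrix with rows (2, 1) and (1, 2), and `function_of` is an illustrative helper name.

```python
import cmath

# Assumed spectral form of the hypothetical matrix [[2, 1], [1, 2]]:
# proper values 1 and 3 with the projections below.
ALPHAS = [1.0, 3.0]
PROJECTIONS = [[[0.5, -0.5], [-0.5, 0.5]],
               [[0.5, 0.5], [0.5, 0.5]]]

def function_of(f):
    # f(A) = sum over j of f(alpha_j) E_j.
    return [[sum(f(a) * e[r][c] for a, e in zip(ALPHAS, PROJECTIONS))
             for c in range(2)] for r in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

A = function_of(lambda z: z)                    # recovers A itself
INVERSE = function_of(lambda z: 1.0 / z)        # f(zeta) = 1/zeta
U = function_of(lambda z: cmath.exp(1j * z))    # f(zeta) = e^{i zeta}

A_TIMES_INVERSE = matmul(A, INVERSE)
U_STAR_U = matmul(adjoint(U), U)
```

Up to rounding, `A_TIMES_INVERSE` and `U_STAR_U` are both the identity: the first reflects $f(A) = A^{-1}$ for $f(\zeta) = 1/\zeta$, the second the unitary character of $e^{iA}$ for a Hermitian $A$.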

§66. PROPERTIES OF NON NEGATIVE TRANSFORMATIONS 

A particularly important function is the square root
of non negative operators. We consider $f(\zeta) = \sqrt{\zeta}$,
defined for all real $\zeta \geq 0$ as the non negative square
root of $\zeta$, and for every non negative $A = \sum_j \alpha_j E_j$,
$\alpha_j \geq 0$, we consider

\[ f(A) = \sqrt{A} = \sum_j \sqrt{\alpha_j}\, E_j. \]

It is clear that $\sqrt{A} \geq 0$ and that $(\sqrt{A})^2 = A$; we
should like to investigate the extent to which these
properties characterize $\sqrt{A}$. At first glance it may
seem hopeless to look for any uniqueness since if we consider
$B = \sum_j \pm\sqrt{\alpha_j}\, E_j$, with an arbitrary choice of
sign in each place, we still have $A = B^2$. The $\sqrt{A}$ we
constructed, however, was non negative, and we can show
that this additional property guarantees uniqueness: in
other words $A = B^2$, $B \geq 0$, implies $B = \sqrt{A}$. For let
$B = \sum_j \beta_j F_j$ be the spectral form of $B$; then

\[ \sum_j \beta_j^2 F_j = B^2 = A = \sum_j \alpha_j E_j. \]

Since the $\beta_j$ are distinct and non negative so also are
the $\beta_j^2$; the uniqueness of the spectral form of $A$ implies
that each $\beta_j^2$ is equal to some $\alpha_j$ (and conversely),
and that the corresponding $E$'s and $F$'s are
equal. By a permutation of the indices we may therefore
achieve $\beta_j^2 = \alpha_j$, $j = 1, \ldots, p$, so that $\beta_j = \sqrt{\alpha_j}$,
as was to be shown.
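The role of non negativeness in the uniqueness statement can be seen on an example. The following is a minimal sketch, not part of the text: the matrix `A` with proper values 1 and 3 and projections `E1`, `E2` is a hypothetical real example, and `quadratic_form` evaluates $(Bx,x)$ for real coordinates only.

```python
import math

# Assumed spectral form A = 1*E1 + 3*E2 of a hypothetical matrix.
E1 = [[0.5, -0.5], [-0.5, 0.5]]
E2 = [[0.5, 0.5], [0.5, 0.5]]
A = [[2.0, 1.0], [1.0, 2.0]]

def combination(c1, c2):
    # Returns c1*E1 + c2*E2.
    return [[c1 * E1[i][j] + c2 * E2[i][j] for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def quadratic_form(m, x):
    # (Mx, x) for a real matrix m and a real vector x.
    return sum(m[i][j] * x[j] * x[i] for i in range(2) for j in range(2))

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(2) for j in range(2))

ROOT = combination(1.0, math.sqrt(3.0))     # the non negative square root
OTHER = combination(-1.0, math.sqrt(3.0))   # one sign changed

both_square_to_a = close(matmul(ROOT, ROOT), A) and close(matmul(OTHER, OTHER), A)
value_on_range_of_e1 = quadratic_form(OTHER, [1.0, -1.0])
```

Both `ROOT` and `OTHER` square to `A`, but `value_on_range_of_e1` is negative, so `OTHER` is not non negative and does not compete with `ROOT` for the name $\sqrt{A}$.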

There are several important applications of the 
existence of square roots for non negative operators of 
which we now give two. 

First: we recall that in § 56 we mentioned three
possible definitions of a non negative transformation
and adopted the weakest one, namely that $(Ax,x) \geq 0$ for
all $x$. The strongest of the three possible definitions
was that we could write $A$ in the form $A = B^2$
with a Hermitian $B$; we point out that the result of
this section concerning square roots implies that the
(seemingly) weakest of our conditions implies and is
therefore equivalent to the strongest. (In fact we can
even achieve a unique non negative $B$!)

Second: in § 56 we stated also that if $A$ and $B$
are non negative and commutative then $AB$ is also non
negative; we can now give an easy proof of this assertion.
The commutativity of $A$ and $B$ implies that any
two of the transformations $A$, $B$, $\sqrt{A}$, $\sqrt{B}$ commute with
each other; consequently

\[ AB = \sqrt{A}\sqrt{A}\sqrt{B}\sqrt{B} = \sqrt{A}\sqrt{B}\sqrt{A}\sqrt{B} = (\sqrt{A}\sqrt{B})^2. \]

Since $\sqrt{A}$ and $\sqrt{B}$ are Hermitian and commutative, their
product is Hermitian and therefore its square is non
negative.

The spectral theory also makes it quite easy to
characterize the matrix (with respect to an arbitrary
orthogonal coordinate system) of a non negative transformation
$A$. Since the determinant $\Delta(A)$ is the
product of the proper values of $A$ it is clear that
$A \geq 0$ implies $\Delta(A) \geq 0$. If we consider the defining
property of non negativeness expressed in terms of the
matrix $(\alpha_{ij})$ of $A$, i.e.

\[ \sum_i \sum_j \alpha_{ij} \xi_j \bar{\xi}_i \geq 0, \]

we observe
that this last expression remains non negative if we
restrict the coordinates $(\xi_1, \ldots, \xi_n)$ by requiring
that a certain number of them vanish. In terms of the
matrix this means that if we cross out columns numbered
$j_1, \ldots, j_k$, say, and cross out also the rows bearing
the same numbers, the remaining small matrix is still
non negative, and consequently so is its determinant.
This fact is usually expressed by saying that the
principal minors of the determinant of a non negative
matrix are non negative. The converse is true; the coefficient
of the j-th power of $\lambda$ in the characteristic
polynomial $\Delta(A - \lambda 1)$ of $A$ is (except for sign) the
sum of all principal minors of $n - j$ rows and columns.
The sign is alternately plus and minus; this implies
that if $A$ has non negative principal minors and is
Hermitian (so that the zeros of $\Delta(A - \lambda 1)$ are known to
be real) then the proper values of $A$ are non negative.
(Proof?) Since the Hermitian character of a matrix is
ascertainable by observing whether or not the elements
$\alpha_{ij}$ are Hermitian symmetric ($\alpha_{ij} = \bar{\alpha}_{ji}$), our comments
reduce the problem of finding out whether or not
a matrix is non negative to a finite number of elementary
computations.
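The finite test described above (Hermitian symmetry together with non negative principal minors) can be written out for small matrices. The following is a minimal sketch, not part of the text: the function names are illustrative, the determinant is computed by cofactor expansion (adequate only for small matrices), and a rounding tolerance `tol` is an added assumption.

```python
from itertools import combinations

def det(m):
    # Cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_hermitian(m):
    n = len(m)
    return all(m[i][j] == m[j][i].conjugate()
               for i in range(n) for j in range(n))

def is_non_negative(m, tol=1e-12):
    # Hermitian symmetry plus non negativity of every principal minor,
    # i.e. of every determinant obtained by keeping the same set of rows
    # and columns.
    if not is_hermitian(m):
        return False
    n = len(m)
    for size in range(1, n + 1):
        for keep in combinations(range(n), size):
            minor = det([[m[i][j] for j in keep] for i in keep])
            if minor.real < -tol:
                return False
    return True
```

A usage note: the hypothetical matrix with rows (0, 0) and (0, -1) has vanishing determinant and vanishing upper left element, yet fails the test because of the principal minor -1; this is why all principal minors, and not only the leading ones, enter the criterion.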

Using the above characterization of non negativeness,
the reader may verify that if
$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$,
and if $C$ is a Hermitian matrix for which both $A \leq C$
and $B \leq C$ then

\[ C = \begin{pmatrix} 1 + \epsilon & \theta \\ \bar{\theta} & 1 + \delta \end{pmatrix}, \]

where $\epsilon \geq 0$, $\delta \geq 0$, and $|\theta|^2 \leq \min\{\epsilon(1 + \delta), \delta(1 + \epsilon)\}$.
It is also easy to show that for two matrices $C_1$ and $C_2$
of the type of $C$, $C_1 \leq C_2$ can hold [...]. In modern terminology
these facts together show that Hermitian
matrices with the ordering induced by the notion of non
negativeness do not form a lattice. Restricting attention
to the real case and interpreting a matrix
$\begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix}$
as the point $(\alpha, \beta, \gamma)$ in three dimensional space,
the ordering and its non lattice character take on an
amusing geometric aspect.

§67. POLAR DECOMPOSITION 

There is another useful consequence of the theory
of square roots, namely the analog of the polar representation
$\zeta = \rho e^{i\theta}$ of a complex number.

THEOREM 1. If $A$ is an arbitrary
linear transformation on a finite dimensional
unitary space $\mathfrak{V}$, then there is a
(uniquely determined) non negative transformation
$P$, and a unitary transformation
$U$, such that $A = UP$. $U$ is uniquely determined
by $A$ if and only if $A$ is non
singular.

PROOF. Although it is not logically necessary to do
so we shall first give the proof in the case where $A$
has an inverse: the proof in the other case is an obvious
modification of this proof, which gives greater
insight into the geometric structure of arbitrary transformations.

Since the transformation $A^*A$ is non negative we
may find its (unique) non negative square root, $P = \sqrt{A^*A}$.
We write $V = PA^{-1}$; since $VA = P$ the theorem will be
proved if we can prove that $V$ is unitary, for then we
may write $U = V^{-1}$. Since $V^* = (A^{-1})^*P^* = (A^*)^{-1}P$, we
see that

\[ V^*V = (A^*)^{-1}PPA^{-1} = (A^*)^{-1}A^*AA^{-1} = 1, \]

so that (since $\mathfrak{V}$ is finite dimensional) $V$ is unitary,
and we are done. To prove uniqueness we observe that
$UP = U_0P_0$ implies $PU^* = P_0U_0^*$ and therefore

\[ P^2 = PU^*UP = P_0U_0^*U_0P_0 = P_0^2. \]

Since the non negative transformation $P^2 = P_0^2$ has only
one non negative square root it follows that $P = P_0$.
(In this proof we did not use the fact that $A$ has an
inverse). If $A$ is non singular then so is $P$ (since
$P = U^{-1}A$), and from this we obtain (multiplying the relation
$UP = U_0P_0$ on the right by $P^{-1} = P_0^{-1}$) $U = U_0$.
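The construction for a non singular $A$ can be followed step by step on a 2 by 2 example. The following is a minimal sketch under stated assumptions: the matrix `A` is hypothetical, and the square root is obtained from a closed formula special to positive definite 2 by 2 matrices (a consequence of the Hamilton-Cayley equation, used here as a shortcut in place of the spectral form of the text).

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(2) for j in range(2))

def sqrt_positive_2x2(m):
    # Assumed shortcut, valid for positive definite 2x2 matrices:
    # sqrt(M) = (M + s*1) / t, with s = sqrt(det M), t = sqrt(trace M + 2s).
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    trace = (m[0][0] + m[1][1]).real
    s = math.sqrt(det)
    t = math.sqrt(trace + 2 * s)
    return [[(m[i][j] + (s if i == j else 0)) / t for j in range(2)]
            for i in range(2)]

def polar(a):
    # P = sqrt(A*A), U = A P^{-1}; requires A to be non singular.
    p = sqrt_positive_2x2(matmul(adjoint(a), a))
    u = matmul(a, inverse2(p))
    return u, p

IDENTITY = [[1, 0], [0, 1]]
A = [[1, 2], [0, 1]]   # hypothetical non singular matrix
U, P = polar(A)
```

For this example `matmul(U, P)` reproduces `A`, `U` is unitary, and `P` is Hermitian with `P` squared equal to $A^*A$.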

We turn now to the general case, where we do not
assume that $A^{-1}$ exists. We form $P$ exactly the same
way as in the preceding proof, so that $P^2 = A^*A$, and
then we observe that for any vector $x$ we have

\[ \|Px\|^2 = (Px,Px) = (P^2x,x) = (A^*Ax,x) = \|Ax\|^2. \]

If for each vector $y$ in the range $\mathfrak{R}(P)$ of $P$, $y = Px$,
we define $Uy = Ax$, the transformation $U$ is length preserving
wherever it is defined. We must show that $U$ is
uniquely determined: i.e. that $Px_1 = Px_2$ implies $Ax_1 = Ax_2$.
This is true since $P(x_1 - x_2) = 0$ is equivalent to
$\|P(x_1 - x_2)\| = 0$ and this latter condition implies
$\|A(x_1 - x_2)\| = 0$. The range of $U$, as defined so far, is
$\mathfrak{R}(A)$; since $U$ is length preserving, $\mathfrak{R}(P)$ and $\mathfrak{R}(A)$
have the same dimension, and consequently so do their
orthogonal complements. If we define $U$ on the orthogonal
complement of $\mathfrak{R}(P)$ to be any length preserving linear
transformation of $\mathfrak{R}(P)^\perp$ onto $\mathfrak{R}(A)^\perp$, then the
transformation $U$, thereby determined on all $\mathfrak{V}$, is
unitary and has the property that $UPx = Ax$ for all $x$.
In other words $A = UP$, as was to be proved. Incidentally,
the extent of non uniqueness of $U$ is clear from
the proof, in which at one place we were free to make an
almost entirely arbitrary choice for $U$. (As long as $U$
is unitary, its behavior on $\mathfrak{R}(P)^\perp$ is immaterial).

Applying the theorem just proved to $A^*$ in place
of $A$, and then taking adjoints, we obtain also the dual
fact that every $A$ may be written in the form $A = PU$
with a unitary $U$ and a non negative $P$.
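
The construction in the non singular case can be followed numerically. The sketch below is only an illustration under stated assumptions: it is restricted to 2-rowed matrices, the helper names (`mul`, `adjoint`, `inverse`, `sqrt_nonneg`, `polar`) are hypothetical, and the square root uses the 2-rowed identity $\sqrt{M} = (M + s\cdot 1)/t$ with $s = \sqrt{\det M}$, $t = \sqrt{\operatorname{tr} M + 2s}$, which is not taken from the text.

```python
import math

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse(X):
    """Inverse of a non singular 2x2 matrix."""
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def sqrt_nonneg(M):
    """Non negative square root of a non negative 2x2 matrix M != 0.

    Assumption (2x2 only): sqrt(M) = (M + s*1) / t with
    s = sqrt(det M) and t = sqrt(trace M + 2*s).
    """
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    s = math.sqrt(max(det, 0.0))
    t = math.sqrt((M[0][0] + M[1][1]).real + 2 * s)
    return [[(M[i][j] + (s if i == j else 0)) / t for j in range(2)]
            for i in range(2)]

def polar(A):
    """Return (U, P) with A = U P for a non singular 2x2 matrix A."""
    P = sqrt_nonneg(mul(adjoint(A), A))   # P = sqrt(A*A)
    U = mul(A, inverse(P))                # U = V^(-1), where V = P A^(-1)
    return U, P

A = [[1 + 1j, 2], [0, 1j]]
U, P = polar(A)
```

Multiplying out `mul(adjoint(U), U)` and `mul(U, P)` for this `A` should reproduce the identity and `A` up to rounding error; for a singular `A` the call to `inverse` fails, in agreement with the fact that $U$ must then be chosen by the second method of the proof.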





In geometric language this theorem is sometimes
stated in the following form: every linear transformation
on $\mathcal{V}$ is effected by a dilatation followed by a
rotation. (The justification for the terminology is
clear from the diagonal forms of the matrices of non
negative and unitary transformations). The reader might
give an alternative proof of this theorem, for the
special case of normal transformations, by using the
spectral form.

In contrast with the Cartesian decomposition
we call the representation $A = UP$ the polar decomposition
of $A$; in terms of this decomposition we obtain
a new characterization of normality.

THEOREM 2. If $A = UP$ is the polar
decomposition of the linear transformation
$A$ then a necessary and sufficient condition
that $A$ be normal is that $UP = PU$.

PROOF. (Since $U$ is not necessarily uniquely
determined by $A$ this statement is to be interpreted as
follows: if $A$ is normal then $P$ commutes with every
$U$, and if $P$ commutes with any single $U$ then $A$ is
normal).

Since $AA^* = UP^2U^* = UP^2U^{-1}$ and $A^*A = P^2$, it is
clear that $A$ is normal if and only if $U$ commutes
with $P^2$. Since, however, $P^2$ is a function of $P$ and
conversely $P$ is a function of $P^2$, ($P = \sqrt{P^2}$), it
follows that commuting with $P^2$ is equivalent to
commuting with $P$.

§68. PROBLEMS OF COMMUTATIVITY

The spectral theory of normal operators and the
functional calculus may also be used to solve certain
problems concerning commutativity. This is a deep and
extensive subject: more to illustrate the method than
for the actual results we discuss two theorems from it.

THEOREM 1. Two Hermitian transformations
$A$ and $B$ on a finite dimensional
unitary space are commutative if and only
if there exists a Hermitian transformation
$C$ and two real valued functions of a real
variable, say $f$ and $g$, such that $A = f(C)$,
$B = g(C)$. If such a $C$ exists then
we may even choose $C$ in the form $C = h(A,B)$,
where $h$ is a suitable real valued
function of two real variables.

PROOF. The sufficiency of the condition is clear;
we prove only the necessity.

Let $A = \sum_i \alpha_i E_i$, $B = \sum_j \beta_j F_j$ be the spectral
forms of $A$ and $B$; since $A$ and $B$ commute it follows
from (8), §63, that $E_i$ and $F_j$ commute. Let $h(s,t)$
be any function of the two real variables $s$ and $t$ for
which the numbers $h(\alpha_i, \beta_j) = \gamma_{ij}$ are all distinct,
and write

$$ C = h(A,B) = \sum_i \sum_j \gamma_{ij} E_i F_j. $$

(It is clear that $h$ may even be chosen as a polynomial, and
the same will be true about the functions $f$ and $g$ we
are about to describe). Let $f$ and $g$ be such that
$f(\gamma_{ij}) = \alpha_i$ and $g(\gamma_{ij}) = \beta_j$ for all $i$ and $j$.
Then $f(C) = A$ and $g(C) = B$, and everything is proved.

THEOREM 2. If $A$ is a normal
transformation on a finite dimensional unitary
space and if $B$ is an arbitrary transformation
which commutes with $A$ then $B$ commutes
with $A^*$.

PROOF. Let $A = \sum_i \alpha_i E_i$ be the spectral form of
$A$; then $A^* = \sum_i \bar\alpha_i E_i$. Let $f(\zeta)$ be such a function
(polynomial) of the complex variable $\zeta$, that $f(\alpha_i) = \bar\alpha_i$
for all $i$. Then $A^* = f(A)$ and the theorem follows.
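
The polynomial $f$ of this proof can be exhibited explicitly in small cases. The following sketch is an illustration only: the normal matrix `A`, the commuting matrix `B`, and the helper `conjugating_polynomial` are hypothetical choices, and the code handles only a transformation with two distinct proper values, for which $f$ may be taken of degree one.

```python
def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def conjugating_polynomial(a1, a2):
    """Coefficients (c0, c1) of f(z) = c0 + c1*z with f(a) = conj(a)
    at the two distinct points a1 and a2."""
    c1 = (a1.conjugate() - a2.conjugate()) / (a1 - a2)
    c0 = a1.conjugate() - c1 * a1
    return c0, c1

# A is normal, with proper values 1+i and 1-i.
A = [[1, -1], [1, 1]]
c0, c1 = conjugating_polynomial(1 + 1j, 1 - 1j)

# f(A) = c0*1 + c1*A; the proof asserts that this is the adjoint of A.
f_of_A = [[c0 * (i == j) + c1 * A[i][j] for j in range(2)] for i in range(2)]

# B commutes with A, hence with f(A) = A*.
B = [[3, -5], [5, 3]]
commutes_with_A = mul(A, B) == mul(B, A)
commutes_with_adjoint = mul(adjoint(A), B) == mul(B, adjoint(A))
```

Here the interpolation gives $f(\zeta) = 2 - \zeta$, so that $A^* = 2\cdot 1 - A$; since $B$ commutes with $A$ it commutes with every polynomial in $A$, which is the whole content of the proof.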

Theorem 2 is remarkable in two ways. It asserts in
the first place a kind of transitivity for commutativity.
It is not true in general that if $A_1$ commutes with $A$
and $A$ with $B$ then $A_1$ will commute with $B$; but for
$A_1 = A^*$ this is precisely the statement of Theorem 2.

The other remarkable feature of this theorem is its
reluctance to be generalized: its truth or falsity has
not yet been decided for a very large class of operators
in infinite dimensional spaces.

§69. HERMITIAN TRANSFORMATIONS OF RANK ONE

We have already seen (Theorem 2, §38) that every
linear transformation $A$ of rank $\rho$ is the sum of $\rho$
linear transformations of rank one. It is easy to see
(using the spectral theorem) that if $A$ is Hermitian,
or non negative, then the summands may also be taken
Hermitian, or non negative, respectively. We know (Theorem
1, §38) what the matrix of a transformation of rank one
has to be; what more can we say if the transformation is
Hermitian or non negative?

THEOREM 1. If $A$ has rank one and is
Hermitian (or non negative) then in every
orthogonal coordinate system the matrix
$(\alpha_{ij})$ of $A$ has the form $\alpha_{ij} = \kappa\beta_i\bar\beta_j$ with a
real $\kappa$, (or the form $\alpha_{ij} = \gamma_i\bar\gamma_j$). If,
conversely, $[A]$ has this form in a single
orthogonal coordinate system then $A$ has rank one
and is Hermitian (or non negative).

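PROOF. We know that the matrix $(\alpha_{ij})$ of a
transformation $A$ of rank one, in any orthogonal coordinate
system $\mathcal{X}$, has the form $\alpha_{ij} = \beta_i\gamma_j$.
If $A$ is Hermitian we must also have $\alpha_{ij} = \bar\alpha_{ji}$, whence
$\beta_i\gamma_j = \bar\beta_j\bar\gamma_i$. If for some $i$, $\beta_i = 0$ and $\gamma_i \neq 0$,
then for all $j$, $\bar\beta_j = \beta_i\gamma_j/\bar\gamma_i = 0$, whence $A = 0$.
Since we assumed that the rank of $A$ is one (and not
zero) this is impossible. Similarly $\beta_i \neq 0$ and $\gamma_i = 0$
is impossible; i.e. we can find an $i$ for which $\beta_i\gamma_i \neq 0$.
Using this $i$ we have $\gamma_j = (\bar\gamma_i/\beta_i)\bar\beta_j = \kappa\bar\beta_j$
with some constant $\kappa$ independent of $j$. Since the
diagonal elements $\alpha_{jj} = \beta_j\gamma_j = \kappa|\beta_j|^2$ of a Hermitian
matrix are real, we can even conclude that in this case
$\alpha_{ij} = \kappa\beta_i\bar\beta_j$ with a real $\kappa$.

If, moreover, $A$ is non negative, then we even know
that $\alpha_{jj} = \kappa|\beta_j|^2$ is non negative, and
therefore so is $\kappa$. In this case we write $\lambda = \sqrt{\kappa}$,
and the relation $\kappa\beta_i\bar\beta_j = (\lambda\beta_i)(\overline{\lambda\beta_j})$ shows that $\alpha_{ij}$
has the form $\gamma_i\bar\gamma_j$.

It is easy to see that these necessary conditions
are also sufficient. If $\alpha_{ij} = \kappa\beta_i\bar\beta_j$ with a real $\kappa$,
then $A$ is Hermitian. If $\alpha_{ij} = \gamma_i\bar\gamma_j$ and
$x = \sum_i \xi_i x_i$, then

$$ (Ax,x) = \sum_i\sum_j \alpha_{ij}\bar\xi_i\xi_j = \sum_i\sum_j \gamma_i\bar\gamma_j\bar\xi_i\xi_j = \Big(\sum_i \gamma_i\bar\xi_i\Big)\Big(\overline{\sum_i \gamma_i\bar\xi_i}\Big) = \Big|\sum_i \gamma_i\bar\xi_i\Big|^2 \geq 0, $$

so that $A$ is non negative.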

As a consequence of Theorem 1 it is very easy to
prove a remarkable theorem on non negative matrices.


THEOREM 2. If $A$ and $B$ are non negative
linear transformations whose matrices in
some orthogonal coordinate system are $(\alpha_{ij})$
and $(\beta_{ij})$ respectively then the linear
transformation $C$, whose matrix $(\gamma_{ij})$ in
this same coordinate system is given by
$\gamma_{ij} = \alpha_{ij}\beta_{ij}$ for all $i$ and $j$, is also non
negative. (The matrix $(\gamma_{ij})$ is called the
Hadamard product of $(\alpha_{ij})$ and $(\beta_{ij})$.)

PROOF. We may write both $A$ and $B$ as a
sum of non negative transformations of rank one, so that

$$ \alpha_{ij} = \sum_p \alpha_i^{(p)}\,\overline{\alpha_j^{(p)}}, \qquad \beta_{ij} = \sum_q \beta_i^{(q)}\,\overline{\beta_j^{(q)}}, $$

and it follows that

$$ \gamma_{ij} = \sum_p\sum_q \big(\alpha_i^{(p)}\beta_i^{(q)}\big)\big(\overline{\alpha_j^{(p)}\beta_j^{(q)}}\big). $$

Since a sum of non negative matrices is non negative, it
will be sufficient to prove that, for each fixed $p$ and
$q$, $(\alpha_i^{(p)}\beta_i^{(q)})(\overline{\alpha_j^{(p)}\beta_j^{(q)}})$ defines a non negative matrix: and
this follows from Theorem 1. This proof shows by the
way that Theorem 2 remains valid if we replace "non negative"
by "Hermitian" in both hypothesis and conclusion;
for later applications, however, it is only the actually
stated version that will be useful to us.
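
A small numerical instance may make the statement concrete. The sketch below is an illustration under stated assumptions: it treats only 2-rowed matrices, the vectors used to build `A` and `B` are arbitrary choices, and the helper `is_nonneg` relies on the criterion that a 2-rowed Hermitian matrix is non negative exactly when its diagonal elements and its determinant are non negative (a standard fact, not taken from the text).

```python
def rank_one(g):
    """Matrix (g_i * conj(g_j)) of a non negative transformation of rank one."""
    return [[g[i] * g[j].conjugate() for j in range(2)] for i in range(2)]

def add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def hadamard(X, Y):
    """Hadamard product: gamma_ij = alpha_ij * beta_ij."""
    return [[X[i][j] * Y[i][j] for j in range(2)] for i in range(2)]

def is_nonneg(M, tol=1e-12):
    """2x2 test: Hermitian, with diagonal elements and determinant >= 0."""
    hermitian = all(abs(M[i][j] - M[j][i].conjugate()) <= tol
                    for i in range(2) for j in range(2))
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    return (hermitian and M[0][0].real >= -tol
            and M[1][1].real >= -tol and det >= -tol)

# Each of A and B is a sum of two non negative matrices of rank one.
A = add(rank_one([1, 1j]), rank_one([2, 0]))    # [[5, -i], [i, 1]]
B = add(rank_one([1, -1]), rank_one([0, 1]))    # [[1, -1], [-1, 2]]
C = hadamard(A, B)                              # [[5, i], [-i, 2]]
```

For these matrices `is_nonneg` accepts `A`, `B`, and `C` alike; the determinant of `C` is 9.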

§70. CONVERGENCE OF VECTORS

Essentially the only way in which we exploited, so
far, the existence of an inner product in unitary spaces
was to introduce the notion of a normal transformation.

A much more obvious circle of ideas is the study of the
convergence problems that arise in a unitary space.

Let us see what we might mean by the assertion that
a sequence $\{x_n\}$ of vectors in $\mathcal{V}$ converges to a
vector $x$ in $\mathcal{V}$. There are two possibilities that suggest
themselves:

(i) $\|x_n - x\| \to 0$ as $n \to \infty$;

(ii) $(x_n - x, y) \to 0$ as $n \to \infty$, for each fixed
$y$ in $\mathcal{V}$.

If (i) is true then we have for every $y$

$$ |(x_n - x, y)| \leq \|x_n - x\| \cdot \|y\| \to 0, $$

so that (ii) is true. In a finite dimensional space the
converse implication is valid: (ii) $\Rightarrow$ (i). To prove
this let $z_1, \ldots, z_N$ be an orthogonal basis in $\mathcal{V}$.
(Often in the remainder of this chapter we shall write
$N$ for the dimension of a finite dimensional vector
space, in order to reserve $n$ for the dummy variable in
limiting processes). If we assume (ii) then, for each
$i = 1, \ldots, N$, $(x_n - x, z_i) \to 0$. Since (Theorem 2, §4?)

$$ \|x_n - x\|^2 = \sum_i |(x_n - x, z_i)|^2 $$

it follows that $\|x_n - x\| \to 0$, as was to be proved.

Concerning the convergence of vectors (in either of
the two equivalent senses) we shall use without proof
the following facts. (All these facts are easy
consequences of our definitions and the properties of
convergence in the usual domain of complex numbers: we
assume that the reader has a modicum of familiarity with
these notions).

The expression $\alpha x + \beta y$ is a continuous function
of all its arguments simultaneously; i.e. if $\{\alpha_n\}$ and
$\{\beta_n\}$ are sequences of complex numbers and $\{x_n\}$ and
$\{y_n\}$ are sequences of vectors, then $\alpha_n \to \alpha$, $\beta_n \to \beta$,
$x_n \to x$, $y_n \to y$, implies that $\alpha_n x_n + \beta_n y_n \to \alpha x + \beta y$. If
$\{z_i\}$ is an orthogonal basis in $\mathcal{V}$, $x_n = \sum_i \alpha_{in} z_i$,
$x = \sum_i \alpha_i z_i$, then $x_n \to x$ if and only if $\alpha_{in} \to \alpha_i$ (as
$n \to \infty$) for each $i = 1, \ldots, N$. (Thus the notion of
convergence here defined coincides with the usual one in
$N$-dimensional complex Euclidean space). Finally we shall
assume as known the fact that a finite dimensional unitary
space $\mathcal{V}$ with the metric $\|x - y\|$ is complete: i.e.
if $\{x_n\}$ is a sequence of vectors for which $\|x_n - x_m\| \to 0$
as $n, m \to \infty$, then there is a unique vector $x$
such that $x_n \to x$ as $n \to \infty$.

§71. BOUND OF A LINEAR TRANSFORMATION

The metric properties of vectors have certain
important implications for the metric properties of linear
transformations, which we now begin to study.



DEFINITION. A linear transformation $A$
on a unitary space $\mathcal{V}$ is bounded if there
exists a positive finite constant $K$ such
that for every vector $x$ in $\mathcal{V}$, $\|Ax\| \leq K\|x\|$.
The greatest lower bound of all
constants $K$ with this property is called
the bound of $A$, in symbols $|||A|||$; it is
clear that for every $x$, $\|Ax\| \leq |||A||| \cdot \|x\|$.

For examples we may consider the cases where $A$ is
a perpendicular projection $\neq 0$ or a unitary transformation:
Theorem 1 of §57 and the theorem of §59 respectively
imply that in both cases $|||A||| = 1$. Consideration
of the vectors $x_n = x_n(t) = t^n$ in $\mathcal{P}$ shows that the
differentiation operator is not bounded.

Because in the sequel we shall have occasion to
consider quite a few upper and lower bounds similar to
$|||A|||$ we introduce a convenient notation. If $P$ is
any possible property of real numbers $t$, we shall denote
the set of all real numbers $t$ possessing property $P$
by the symbol $\{t : P\}$, and we shall denote greatest
lower bound and least upper bound by inf (for infimum)
and sup (for supremum) respectively. In this notation
for example

$$ |||A||| = \inf \{K : \|Ax\| \leq K\|x\| \text{ for all } x\}. $$

The notion of boundedness is closely connected with
the notion of continuity. If $A$ is bounded and if $\varepsilon$
is any positive number, by choosing a positive $\delta$ with
$\delta \cdot |||A||| < \varepsilon$ we may make sure that $\|x - y\| < \delta$ implies

$$ \|Ax - Ay\| = \|A(x - y)\| \leq |||A||| \cdot \|x - y\| < \varepsilon; $$

in other words boundedness implies (uniform) continuity.
It is true that the best possible converse of this
result is valid: continuity of $A$ at any single point
implies that $A$ is bounded and consequently uniformly
continuous over the whole space. Since we shall not need
this result we leave its proof to the interested reader:
we turn rather to the proof that in the case of chief
interest to us boundedness is always present.

THEOREM. Every linear transformation
$A$ on a finite dimensional unitary space
$\mathcal{V}$ is bounded.

PROOF. Let $\{x_1, \ldots, x_N\}$ be an orthogonal basis
in $\mathcal{V}$ and write

$$ K_0 = \max \{\|Ax_1\|, \ldots, \|Ax_N\|\}. $$

Since an arbitrary vector $x$ may be written in the
form $x = \sum_i (x,x_i)x_i$, we obtain, applying the Schwarz
inequality and remembering that $\|x_i\| = 1$,

$$ \|Ax\| = \Big\|A\Big(\sum_i (x,x_i)x_i\Big)\Big\| = \Big\|\sum_i (x,x_i)Ax_i\Big\| \leq \sum_i |(x,x_i)| \cdot \|Ax_i\| $$

$$ \leq \sum_i \|x\| \cdot \|x_i\| \cdot \|Ax_i\| \leq K_0 \sum_i \|x\| = NK_0\|x\|. $$

In other words $K = NK_0$ is a bound of $A$.

It is no accident that the dimension $N$ of $\mathcal{V}$
enters into our evaluation: we have already seen that
the theorem is not true in non finite dimensional spaces.

§72. EXPRESSIONS FOR THE BOUND 

To facilitate working with the bound of a
transformation we consider the following four expressions:

$$ p = \sup \{\|Ax\|/\|x\| : x \neq 0\}, $$

$$ q = \sup \{\|Ax\| : \|x\| = 1\}, $$

$$ r = \sup \{|(Ax,y)|/\|x\| \cdot \|y\| : x \neq 0,\ y \neq 0\}, $$

$$ s = \sup \{|(Ax,y)| : \|x\| = \|y\| = 1\}. $$

(In accordance with our definition of the brace notation
the expression $\{\|Ax\| : \|x\| = 1\}$, for example, means
the set of all real numbers of the form $\|Ax\|$,
considered for all $x$'s for which $\|x\| = 1$).

Since $\|Ax\| \leq K\|x\|$ is trivially true with any
non negative $K$ if $x = 0$, the definition of sup
implies that $p = |||A|||$; we shall prove that in fact
$p = q = r = s = |||A|||$. Since the supremum in the
expression for $q$ is being extended over a subset of the
corresponding set for $p$, (i.e. if $\|x\| = 1$ then
$\|Ax\|/\|x\| = \|Ax\|$), we see that $q \leq p$; a similar
argument shows that $s \leq r$.

For any $x \neq 0$ we consider $y = x/\|x\|$, (so that
$\|y\| = 1$); we have $\|Ax\|/\|x\| = \|Ay\|$. In other
words every number of the set whose supremum is $p$,
occurs also in the corresponding set for $q$; it follows
that $p \leq q$, and consequently $p = q = |||A|||$.

Similarly if $x \neq 0$ and $y \neq 0$ we consider
$x' = x/\|x\|$ and $y' = y/\|y\|$; we have

$$ |(Ax,y)|/\|x\| \cdot \|y\| = |(Ax',y')|, $$

and hence, by the argument just used, $r \leq s$, so that
$r = s$.

To consolidate our position: we have proved so far
that

$$ p = q = |||A|||, \qquad r = s. $$

Since, by the Schwarz inequality,

$$ \frac{|(Ax,y)|}{\|x\| \cdot \|y\|} \leq \frac{\|Ax\| \cdot \|y\|}{\|x\| \cdot \|y\|} = \frac{\|Ax\|}{\|x\|}, $$

it follows that $r \leq p$; we shall complete the proof by
showing that $p \leq r$. For this purpose we consider any
vector $x$ for which $Ax \neq 0$ (so that also $x \neq 0$); for
such an $x$ we write $y = Ax$ and we have

$$ \|Ax\|/\|x\| = |(Ax,y)|/\|x\| \cdot \|y\|. $$

In other words we proved that every number which occurs
in the set defining $p$, and which is not zero, occurs
also in the set of which $r$ is the supremum: this
clearly implies the desired result.

The numerical function $|||A|||$ of operators $A$
satisfies the following four relations

(1) $|||A + B||| \leq |||A||| + |||B|||$,

(2) $|||AB||| \leq |||A||| \cdot |||B|||$,

(3) $|||\alpha A||| = |\alpha| \cdot |||A|||$,

(4) $|||A^*||| = |||A|||$.

The proof of the first three of these is immediate from
the definition of a bound; for the proof of (4) we use
the equation $|||A||| = r$, as follows. Since

$$ |(Ax,y)| = |(x,A^*y)| \leq \|x\| \cdot \|A^*y\| \leq |||A^*||| \cdot \|x\| \cdot \|y\|, $$

we see that $|||A||| \leq |||A^*|||$; replacing $A$ by $A^*$ and
$A^*$ by $A^{**} = A$ we obtain the reverse inequality.

§73. BOUNDS OF A HERMITIAN TRANSFORMATION

As usual we can say a little more about the special
case of Hermitian transformations than in the general
case. We consider for any Hermitian transformation $A$
the sets of real numbers

$$ \Phi = \{(Ax,x)/\|x\|^2 : x \neq 0\}, $$

$$ \Psi = \{(Ax,x) : \|x\| = 1\}. $$

It is clear that $\Psi \subset \Phi$. If for every $x \neq 0$ we
write $y = x/\|x\|$ then $\|y\| = 1$ and $(Ax,x)/\|x\|^2 = (Ay,y)$,
so that every number in $\Phi$ occurs also in $\Psi$,
and consequently $\Phi = \Psi$. We write

$$ \alpha = \inf \Phi = \inf \Psi, $$

$$ \beta = \sup \Phi = \sup \Psi; $$

$\alpha$ is the lower bound and $\beta$ the upper bound of the
Hermitian transformation $A$. If we recall the definition of
non negative transformations we see that $\alpha$ is the
greatest real number for which $A - \alpha 1 \geq 0$, and $\beta$ is the
least real number for which $\beta 1 - A \geq 0$. Concerning
these numbers we assert that

$$ \gamma = \max \{|\alpha|, |\beta|\} = |||A|||. $$



PROOF. Since $|(Ax,x)| \leq \|Ax\| \cdot \|x\| \leq |||A||| \cdot \|x\|^2$,
it is clear that $|\alpha|$ and $|\beta|$ are
both $\leq |||A|||$. To prove the reverse inequality we
observe that the non negative character of the two linear
transformations $\gamma 1 - A$ and $\gamma 1 + A$ implies that both

$$ (\gamma 1 + A)^*(\gamma 1 - A)(\gamma 1 + A) = (\gamma 1 + A)(\gamma 1 - A)(\gamma 1 + A) $$

and

$$ (\gamma 1 - A)^*(\gamma 1 + A)(\gamma 1 - A) = (\gamma 1 - A)(\gamma 1 + A)(\gamma 1 - A) $$

are non negative, and therefore so also is their sum
$2\gamma(\gamma^2 1 - A^2)$. Since $\gamma = 0$ implies $|||A||| = 0$ the
theorem is trivial in this case; in any other case we
may divide by $2\gamma$ and obtain the result that $\gamma^2 1 - A^2 \geq 0$.
In other words

$$ \gamma^2\|x\|^2 = \gamma^2(x,x) \geq (A^2x,x) = \|Ax\|^2, $$

whence $\gamma \geq |||A|||$, and the proof is complete.

We call the reader's attention to the fact that the
computation in the main body of this proof could have
been avoided entirely. Since $\gamma 1 - A$ and $\gamma 1 + A$ are
both non negative, and since they commute, we may
conclude immediately (§66) that their product $\gamma^2 1 - A^2$ is
non negative. We presented this roundabout method in
accordance with the principle that, with an eye to the
generalizations of our theory, one should avoid using
the spectral theory whenever possible. Our proof of the
fact that the non negativeness and commutativity of $A$
and $B$ imply $AB \geq 0$ was based on the existence of
square roots for non negative transformations. This
theorem can also be proved by so called "elementary"
methods, i.e. methods not using the spectral theorem,
but even the simplest elementary proof involves
complications which are purely technical and for our purposes
not particularly useful.





§74. MINIMAX PRINCIPLE 

A very elegant and useful fact concerning Hermitian
transformations is the following minimax principle, due
to R. Courant.

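THEOREM. Let $A$ be a Hermitian
transformation on an $n$-dimensional unitary space
$\mathcal{V}$, and let $\lambda_1, \ldots, \lambda_n$ be the (not
necessarily distinct) proper values of $A$, with the
notation so chosen that $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$.
For any linear manifold $\mathcal{M}$ in $\mathcal{V}$ we write

$$ \mu(\mathcal{M}) = \sup \{(Ax,x) : x \text{ in } \mathcal{M},\ \|x\| = 1\} $$

and for $k = 1, \ldots, n$ we define

$$ \mu_k = \inf \{\mu(\mathcal{M}) : \text{dimension of } \mathcal{M} = n - k + 1\}. $$

Then $\mu_k = \lambda_k$ for $k = 1, \ldots, n$.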


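PROOF. Let $x_1, \ldots, x_n$ be an orthogonal basis in
$\mathcal{V}$ for which $Ax_i = \lambda_i x_i$, $i = 1, \ldots, n$ (§65); let
$\mathcal{M}_k$ be the linear manifold spanned by $x_1, \ldots, x_k$,
$k = 1, \ldots, n$. Since the dimension of $\mathcal{M}_k$ is $k$, $\mathcal{M}_k$
cannot be disjoint from any $n - k + 1$ dimensional
linear manifold $\mathcal{M}$ in $\mathcal{V}$; if $\mathcal{M}$ is any such manifold
we may find a vector $x$ belonging to both $\mathcal{M}_k$ and $\mathcal{M}$
and such that $\|x\| = 1$. For this $x = \sum_{i=1}^{k} \xi_i x_i$ we
have

$$ (Ax,x) = \sum_{i=1}^{k} \lambda_i|\xi_i|^2 \geq \lambda_k \sum_{i=1}^{k} |\xi_i|^2 = \lambda_k, $$

so that $\mu(\mathcal{M}) \geq \lambda_k$.

If on the other hand we consider the particular
$(n - k + 1)$-dimensional linear manifold $\mathcal{M}_0$ spanned
by $x_k, \ldots, x_n$, then for any $x = \sum_{i=k}^{n} \xi_i x_i$ in
this manifold we have (assuming $\|x\| = 1$)

$$ (Ax,x) = \sum_{i=k}^{n} \lambda_i|\xi_i|^2 \leq \lambda_k \sum_{i=k}^{n} |\xi_i|^2 = \lambda_k, $$

so that $\mu(\mathcal{M}_0) \leq \lambda_k$.

In other words, as $\mathcal{M}$ runs over all $n - k + 1$
dimensional linear manifolds, $\mu(\mathcal{M})$ is always $\geq \lambda_k$
and is at least once $\leq \lambda_k$; this shows that $\mu_k = \lambda_k$,
as was to be proved.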
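
In two dimensions the principle can be checked by direct sampling. The sketch below is an illustration under stated assumptions: the matrix `H` is an arbitrary real symmetric example, the names `quadratic_form` and `mu_values` are hypothetical, and only real unit vectors are sampled, which suffices for a real symmetric matrix because $(Hx,x)$ for a complex unit vector is a weighted mean of its values on two real unit vectors. For $n = 2$, $\mu_1$ is the supremum of $(Hx,x)$ over all unit vectors, and $\mu_2$ is the infimum, since a one dimensional manifold carries a single value of $(Hx,x)$ on its unit vectors.

```python
import math

def quadratic_form(H, x):
    """(Hx, x) for a real symmetric 2x2 matrix H and a real vector x."""
    Hx = [H[0][0] * x[0] + H[0][1] * x[1],
          H[1][0] * x[0] + H[1][1] * x[1]]
    return Hx[0] * x[0] + Hx[1] * x[1]

def mu_values(H, steps=3600):
    """Approximate (mu_1, mu_2) by sampling unit vectors (cos t, sin t),
    0 <= t < pi; each t represents one one-dimensional manifold."""
    values = []
    for k in range(steps):
        t = k * math.pi / steps
        values.append(quadratic_form(H, (math.cos(t), math.sin(t))))
    return max(values), min(values)

H = [[2, 1], [1, 2]]          # proper values 3 and 1
mu_1, mu_2 = mu_values(H)     # approximately 3 and 1
```

For this `H` one has $(Hx,x) = 2 + \sin 2t$, so the sampled extreme values agree with the proper values 3 and 1 up to rounding.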

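In particular for $k = 1$ we see that the upper bound $\beta$
of a Hermitian transformation $A$ is its greatest proper
value $\lambda_1$ (and, applying this to $-A$, that the lower
bound $\alpha$ is its least proper value $\lambda_n$); it follows (using
§73) that $|||A|||$ is equal to the greatest of the absolute
values of the proper values of $A$.
That this is not true for all linear transformations is
seen by considering the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. (See comment at
the end of §55. This particular matrix has practically
no properties and is very useful, for this reason, in
the construction of counter examples).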
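
The contrast between the Hermitian case and the counter example can be computed directly. The sketch below is an illustration only, restricted to 2-rowed matrices with hypothetical helper names; it uses the standard fact, not proved in the text, that $|||A|||^2$ is the greatest proper value of the Hermitian transformation $A^*A$.

```python
import math

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def hermitian_proper_values(H):
    """Proper values (greatest first) of a Hermitian 2x2 matrix,
    obtained from its trace and determinant."""
    tr = (H[0][0] + H[1][1]).real
    det = (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real
    d = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (tr + d) / 2, (tr - d) / 2

def bound(A):
    """|||A||| as the square root of the greatest proper value of A*A
    (assumed fact, see the lead-in)."""
    greatest, _ = hermitian_proper_values(mul(adjoint(A), A))
    return math.sqrt(max(greatest, 0.0))

H = [[1, 0], [0, -2]]     # Hermitian, proper values 1 and -2
N = [[0, 1], [0, 0]]      # not Hermitian, both proper values are 0

bound_H = bound(H)        # 2.0 = max(|1|, |-2|)
bound_N = bound(N)        # 1.0, although every proper value of N is 0
```

For `H` the bound is the greatest absolute value of a proper value; for `N` the bound is 1 while the only proper value is 0, which is the failure referred to in the text.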

§75. CONVERGENCE OF LINEAR TRANSFORMATIONS

We return now to the consideration of convergence
problems. There are three obvious senses in which we
may try to define the convergence of a sequence $\{A_n\}$ of
linear transformations to a fixed linear transformation
$A$.

(i) $|||A_n - A||| \to 0$ as $n \to \infty$,

(ii) $\|A_n x - Ax\| \to 0$ as $n \to \infty$ for each fixed $x$,

(iii) $|(A_n x,y) - (Ax,y)| \to 0$ as $n \to \infty$ for each
fixed $x$ and $y$.

If (i) is true then for every $x$

$$ \|A_n x - Ax\| = \|(A_n - A)x\| \leq |||A_n - A||| \cdot \|x\| \to 0, $$

so that (i) $\Rightarrow$ (ii). We have already seen (§70) that
(ii) $\Rightarrow$ (iii) and that in finite dimensional spaces
(iii) $\Rightarrow$ (ii). It is even true that in finite
dimensional spaces (ii) $\Rightarrow$ (i), so that all three conditions
become equivalent. To prove this, let $x_1, \ldots, x_N$ be
an orthogonal basis in $\mathcal{V}$; we suppose that (ii) is valid.
Then for any $\varepsilon > 0$ we may find an $n_0 = n_0(\varepsilon)$ such
that for $n \geq n_0$, $\|A_n x_i - Ax_i\| \leq \varepsilon$, $i = 1, \ldots, N$.
It follows that for an arbitrary $x = \sum_i (x,x_i)x_i$ we
have

$$ \|(A_n - A)x\| = \Big\|\sum_i (x,x_i)(A_n - A)x_i\Big\| \leq \sum_i \|x\| \cdot \|(A_n - A)x_i\| \leq \varepsilon N\|x\|, $$

and this implies (i).

It is also easy to prove that, using $|||A - B|||$ as
a distance for operators, the resulting metric space is
complete, i.e. that if $|||A_n - A_m||| \to 0$ as $n, m \to \infty$
then there is an $A$ such that $|||A_n - A||| \to 0$. The
proof of this fact is reduced to the corresponding
statement for vectors. If $|||A_n - A_m||| \to 0$ then for each
$x$, $\|A_n x - A_m x\| \to 0$, so that we may find a vector
corresponding to $x$ which we may denote by, say, $Ax$, such
that $\|A_n x - Ax\| \to 0$. It is clear that the
correspondence from $x$ to $Ax$ is given by a linear
transformation $A$; the implication relation (ii) $\Rightarrow$ (i),
proved above, completes the proof.

Now that we know what convergence means for linear
transformations it behooves us to examine some simple
functions of these transformations in order to verify
their continuity. We assert that $|||A|||$, $\|Ax\|$, $(Ax,y)$,
$Ax$, $A + B$, $\alpha A$, $AB$, and $A^*$ are all continuous functions
of all their arguments simultaneously. (Observe that the
first three are numerical valued functions, the next is
vector valued, and the last four are operator valued).

The proofs of these statements are all quite easy, and
similar to each other; to illustrate the ideas we discuss
$|||A|||$, $Ax$, and $A^*$.

(1) If $A_n \to A$, i.e. $|||A_n - A||| \to 0$ then since
the relations

$$ |||A_n||| \leq |||A_n - A||| + |||A|||, $$

$$ |||A||| \leq |||A - A_n||| + |||A_n|||, $$

imply that

$$ \big|\, |||A_n||| - |||A||| \,\big| \leq |||A_n - A|||, $$

we see that $|||A_n||| \to |||A|||$.

(2) If $A_n \to A$ and $x_n \to x$ then

$$ \|A_n x_n - Ax\| \leq \|A_n x_n - Ax_n\| + \|Ax_n - Ax\| \to 0 $$

so that $A_n x_n \to Ax$.

(3) If $A_n \to A$ then for each $x$ and $y$

$$ (A_n^* x,y) = (x,A_n y) = \overline{(A_n y,x)} \to \overline{(Ay,x)} = \overline{(y,A^*x)} = (A^*x,y), $$

whence $A_n^* \to A^*$.

§76. ERGODIC THEOREM FOR UNITARY TRANSFORMATIONS

The routine work being out of the way, we illustrate
the general theory by considering some very special but
quite important convergence problems. The first of
these is the ergodic theorem for unitary transformations.

THEOREM. Let $U$ be a unitary
transformation on a finite dimensional unitary
space $\mathcal{V}$, and let $\mathcal{M}$ be the linear manifold
of all solutions of $Ux = x$. Then the
sequence $V_n = (1/n)(1 + U + \cdots + U^{n-1})$ converges as
$n \to \infty$ to the perpendicular projection
on $\mathcal{M}$.

PROOF. Let $\mathcal{N}$ be the range of the linear
transformation $1 - U$. For any $x$ in $\mathcal{N}$, $x = y - Uy$, we
have

$$ V_n x = (1/n)(y - Uy + Uy - U^2y + \cdots + U^{n-1}y - U^n y) = (1/n)(y - U^n y), $$

so that

$$ \|V_n x\| = (1/n)\|y - U^n y\| \leq (1/n)(\|y\| + \|U^n y\|) = \frac{2\|y\|}{n}. $$

Hence for $x$ in $\mathcal{N}$, $V_n x$ converges to zero.

On the other hand if $x$ is in $\mathcal{M}$, i.e. $Ux = x$,
then $V_n x = x$, so that in this case $V_n x$ certainly
converges to $x$.

We shall complete the proof by showing that $\mathcal{N}^{\perp} = \mathcal{M}$.
(This will imply that every vector is a sum of two
vectors for which $V_n$ converges, so that $V_n$ converges
everywhere. What we have already proved about the limit
of $V_n$ in $\mathcal{M}$ and $\mathcal{N}$ shows that $V_n x$ will always
converge to the projection of $x$ into $\mathcal{M}$). To show that
$\mathcal{N}^{\perp} = \mathcal{M}$ we observe that $x$ is in the orthogonal
complement of $\mathcal{N}$ if and only if $(x, y - Uy) = 0$ for all $y$.
This in turn implies that

$$ 0 = (x,y - Uy) = (x,y) - (x,Uy) = (x,y) - (U^*x,y) = (x - U^*x,y), $$

i.e. that $x - U^*x = x - U^{-1}x$ is orthogonal to every
vector $y$, so that $x - U^{-1}x = 0$, $x = U^{-1}x$, or $Ux = x$.
Reading the last computation from right to left shows
that this necessary condition is also sufficient; we need
only recall the definition of $\mathcal{M}$ to see that $\mathcal{M} = \mathcal{N}^{\perp}$.

This very ingenious proof, which works with only very
slight modifications in most of the important infinite
dimensional cases, is due to F. Riesz. As an amusing
exercise, which will show how one might have been led to
think of the Riesz proof, the reader may wish to give an
alternative proof based on the spectral theorem for
unitary operators.
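
The convergence asserted by the theorem can be watched on a small example. The sketch below is an illustration under stated assumptions: the unitary matrix `U` is a hypothetical choice with proper vectors $(1,1)$ and $(1,-1)$ and proper values $1$ and $w = e^{i}$, so that $\mathcal{M}$ is spanned by $(1,1)$ and the perpendicular projection on $\mathcal{M}$ has the matrix `E` below.

```python
import cmath

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def average_of_powers(U, n):
    """V_n = (1/n)(1 + U + ... + U^(n-1)) for a 2x2 matrix U."""
    power = [[1, 0], [0, 1]]          # U^0
    total = [[0, 0], [0, 0]]
    for _ in range(n):
        total = [[total[i][j] + power[i][j] for j in range(2)]
                 for i in range(2)]
        power = mul(power, U)
    return [[total[i][j] / n for j in range(2)] for i in range(2)]

w = cmath.exp(1j)                     # |w| = 1 and w != 1
U = [[(1 + w) / 2, (1 - w) / 2],
     [(1 - w) / 2, (1 + w) / 2]]      # U(1,1) = (1,1), U(1,-1) = w(1,-1)
E = [[0.5, 0.5], [0.5, 0.5]]          # perpendicular projection on M

V = average_of_powers(U, 5000)
error = max(abs(V[i][j] - E[i][j]) for i in range(2) for j in range(2))
```

In accordance with the estimate $\|V_n x\| \leq 2\|y\|/n$ of the proof, `error` decreases like $1/n$; with $n = 5000$ it is below $10^{-3}$.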


§77. POWER SERIES 

We consider next the so called Neumann series,
$\sum_{n=0}^{\infty} A^n$, where $A$ is a linear transformation of bound
$< 1$ on a finite dimensional vector space. We write

(1) $S_p = \sum_{n=0}^{p} A^n$;

then

(2) $(1 - A)S_p = S_p - AS_p = 1 - A^{p+1}$.

To prove that $S_p$ has a limit as $p \to \infty$ we consider
(for any $p > q$)

$$ |||S_p - S_q||| \leq \sum_{n=q+1}^{p} |||A^n||| \leq \sum_{n=q+1}^{p} |||A|||^n. $$

Since $|||A||| < 1$ the last written number approaches zero
as $p, q \to \infty$; it follows that $S_p$ has a limit $S$ as
$p \to \infty$. To evaluate this limit we observe that $1 - A$
has an inverse, since $(1 - A)x = 0$ implies that $Ax = x$
and consequently implies (unless $x = 0$) the impossibility

$$ \|Ax\| = \|x\| > |||A||| \cdot \|x\|. $$

Hence we may write (2) in the form

(3) $S_p = (1 - A^{p+1})(1 - A)^{-1} = (1 - A)^{-1}(1 - A^{p+1})$;

since $A^{p+1} \to 0$ as $p \to \infty$, it follows that
$S = (1 - A)^{-1}$.
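
The partial sums $S_p$ are easy to compute. The sketch below is an illustration under stated assumptions: the matrix `A` is an arbitrary example whose bound is less than 1 (the sum of the squares of its elements is 0.30, so that by the Schwarz inequality $\|Ax\| \leq \sqrt{0.30}\,\|x\|$), and the name `neumann_partial_sum` is hypothetical.

```python
def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neumann_partial_sum(A, p):
    """S_p = 1 + A + ... + A^p, computed from S_p = 1 + A S_(p-1)."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    S = identity                       # S_0 = 1
    for _ in range(p):
        AS = mul(A, S)
        S = [[identity[i][j] + AS[i][j] for j in range(2)]
             for i in range(2)]
    return S

A = [[0.2, 0.3], [0.1, 0.4]]
S = neumann_partial_sum(A, 60)

# (1 - A)^(-1) from the explicit inverse of a 2x2 matrix.
d = (1 - A[0][0]) * (1 - A[1][1]) - A[0][1] * A[1][0]
inverse = [[(1 - A[1][1]) / d, A[0][1] / d],
           [A[1][0] / d, (1 - A[0][0]) / d]]

error = max(abs(S[i][j] - inverse[i][j]) for i in range(2) for j in range(2))
```

By (3), the difference between $S_p$ and $(1 - A)^{-1}$ is $(1 - A)^{-1}A^{p+1}$, so that `error` is already at the level of rounding for $p = 60$.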

As another example of infinite series of operators
we consider the exponential series. For an arbitrary
linear transformation $A$ (not necessarily with $|||A||| < 1$)
we write

$$ S_p = \sum_{n=0}^{p} \frac{1}{n!} A^n. $$

Since we have (for any $p > q$)

$$ |||S_p - S_q||| \leq \sum_{n=q+1}^{p} \frac{1}{n!} |||A|||^n, $$

and since the right member, being a part of the power
series for $\exp(|||A|||) = e^{|||A|||}$, converges to zero as
$p, q \to \infty$, we see that there is a linear transformation
$S$ such that $S_p \to S$. We write $S = \exp(A)$; we shall
merely mention some of the elementary properties of this
function of $A$.

Consideration of the superdiagonal forms of $A$ and
$S_p$ shows that the proper values of $\exp(A)$ are, including
multiplicities, the exponentials of the proper values of
$A$. From the consideration of the superdiagonal form it
follows also that the determinant of $\exp(A)$, i.e.
$\prod_{i=1}^{N} \exp(\lambda_i)$, where $\lambda_1, \ldots, \lambda_N$ are the (not
necessarily distinct) proper values of $A$, is the same as
$\exp(\lambda_1 + \cdots + \lambda_N) = \exp(T(A))$. Since $\exp(\zeta) \neq 0$,
this shows incidentally that $\exp(A)$ is always non
singular. Considered as a function of $A$, $\exp(A)$ retains many
of the simple properties of the ordinary numerical
exponential function. Let us, for example, take any two
commutative linear transformations $A$ and $B$. Since
$\exp(A + B) - \exp(A)\exp(B)$ is the limit (as $p \to \infty$) of
the expression

$$ \sum_{n=0}^{p} \frac{1}{n!}(A + B)^n - \Big(\sum_{m=0}^{p} \frac{1}{m!}A^m\Big)\Big(\sum_{k=0}^{p} \frac{1}{k!}B^k\Big) = \sum_{n=0}^{p} \frac{1}{n!}\sum_{j=0}^{n} \binom{n}{j} A^j B^{n-j} - \sum_{m=0}^{p}\sum_{k=0}^{p} \frac{1}{m!\,k!} A^m B^k, $$

we will have proved the multiplication rule for
exponentials when we have proved that this expression converges
to zero. (Here $\binom{n}{j}$ stands for the combinatorial
coefficient $\frac{n!}{j!\,(n-j)!}$). An easy verification yields the fact
that for $k + m \leq p$, $A^m B^k$ occurs in both terms of the
last written expression with coefficients which differ
only in sign; the terms that do not cancel out are all in
the subtrahend and are together equal to

$$ \sum_m \sum_k \frac{1}{m!\,k!} A^m B^k, $$

the summation being extended over those values of $m$
and $k$ which are $\leq p$ and for which $m + k > p$. Since
$m + k > p$ implies that at least one of the two integers
$m$ and $k$ is greater than the integer part of $p/2$ (in
symbols $[p/2]$), the bound of this remainder is dominated
by

$$ \sum_m \sum_k \frac{1}{m!\,k!} |||A|||^m |||B|||^k \leq \Big(\sum_{m=0}^{\infty} \frac{1}{m!}|||A|||^m\Big)\Big(\sum_{k>[p/2]} \frac{1}{k!}|||B|||^k\Big) + \Big(\sum_{k=0}^{\infty} \frac{1}{k!}|||B|||^k\Big)\Big(\sum_{m>[p/2]} \frac{1}{m!}|||A|||^m\Big) $$

$$ = \exp(|||A|||)\,\alpha_p + \exp(|||B|||)\,\beta_p, $$

where $\alpha_p \to 0$ and $\beta_p \to 0$ as $p \to \infty$.
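
Both properties just discussed, the determinant relation and the multiplication rule for commutative transformations, can be tested on partial sums. The sketch below is an illustration under stated assumptions: the matrices `A` and `B` are arbitrary commuting examples ($B = 2A + 1$), the name `exp_series` is hypothetical, and the series is simply cut off after 40 terms.

```python
import math

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(A, p=40):
    """Partial sum S_p = sum of A^n / n! for n = 0, ..., p."""
    term = [[1.0, 0.0], [0.0, 1.0]]    # A^0 / 0!
    total = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, p + 1):
        TA = mul(term, A)
        term = [[TA[i][j] / n for j in range(2)] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)]
                 for i in range(2)]
    return total

A = [[0.0, 1.0], [-1.0, 0.0]]          # trace 0
B = [[1.0, 2.0], [-2.0, 1.0]]          # B = 2A + 1, so that AB = BA
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

left = exp_series(A_plus_B)                    # exp(A + B)
right = mul(exp_series(A), exp_series(B))      # exp(A) exp(B)
rule_error = max(abs(left[i][j] - right[i][j])
                 for i in range(2) for j in range(2))

eA = exp_series(A)
det_exp_A = eA[0][0] * eA[1][1] - eA[0][1] * eA[1][0]   # should be exp(0) = 1
```

For this `A` the series gives the rotation matrix with elements $\cos 1$ and $\pm\sin 1$, whose determinant is 1, the exponential of the sum of the proper values $i$ and $-i$.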

Similar methods serve to discuss $f(A)$ where $f(\zeta)$
is any function representable by a power series,

$$ f(\zeta) = \sum_{n=0}^{\infty} \alpha_n \zeta^n, $$

and where $|||A|||$ is smaller than the radius of
convergence of the series. We leave it to the reader to
verify that the functional calculus we are here hinting at
is consistent with the functional calculus for normal
transformations. Thus for example $\exp(A)$ as defined
above is the same linear transformation as is defined by
our previous notion of $\exp(A)$ for normal $A$'s.




APPENDIX I 


THE CLASSICAL CANONICAL FORM 

The chief difference between the first two chapters
and the third chapter of this book is that in the third
chapter we picked out, by means of properties of the
inner product in a unitary space, certain special classes
of transformations and concentrated our attention on
them. These classes are all subclasses of the class of
normal transformations, and, using the spectral theorem,
we see that the chief virtue of these transformations is
that their structure is completely known if we know what
one dimensional linear manifolds they leave invariant
(i.e. what their proper vectors are). In order to
understand the structure of not necessarily normal
transformations we must study higher dimensional invariant linear
manifolds. It turns out that the results here are of
the same degree of difficulty in a general vector space
as in a unitary space; we return accordingly to the
consideration of arbitrary linear transformations in
arbitrary (finite dimensional) vector spaces. The one
restriction that we retain is that the scalars should be
elements of an algebraically closed field such as the
field of complex numbers. We begin by discussing two
seemingly irrelevant and rather special notions (quotient
space and nilpotent transformation); these concepts are,
however, useful for studying the structure of linear
transformations as well as in many other parts of the
theory.

If $\mathcal{V}$ is any vector space and $\mathcal{M}$ is any linear
manifold in $\mathcal{V}$ we say that two vectors $x$ and $y$ of
$\mathcal{V}$ are congruent modulo $\mathcal{M}$, in symbols $x \equiv y$ ($\mathcal{M}$),
if $x - y$ is in $\mathcal{M}$. (This notion of congruence is a
very close analog of the corresponding notion in number
theory, according to which two integers are called
congruent modulo $m$ if their difference is a multiple of
$m$.) We observe that $x \equiv 0$ ($\mathcal{M}$) is equivalent to $x$
being in $\mathcal{M}$, and that $x_1 \equiv y_1$ ($\mathcal{M}$) and $x_2 \equiv y_2$
($\mathcal{M}$) imply that $\alpha_1 x_1 + \alpha_2 x_2 \equiv \alpha_1 y_1 + \alpha_2 y_2$ ($\mathcal{M}$).
For any $x_0$ in $\mathcal{V}$ let us denote by $x_0^*$ the set of all
vectors $x$ of $\mathcal{V}$ for which $x \equiv x_0$ ($\mathcal{M}$). (Since we
are not going to consider unitary spaces in this appendix
we are at liberty to use the star in a different sense
from the customary one). Let us denote by $\mathcal{V}^* = \mathcal{V}/\mathcal{M}$
the class of sets $x^*$ (called congruence or residue
classes) so obtained; we shall introduce linear
operations into $\mathcal{V}^*$ in such a way that $\mathcal{V}^*$ will become a
vector space. We define $\alpha_1 x_1^* + \alpha_2 x_2^* = (\alpha_1 x_1 + \alpha_2 x_2)^*$;
the only thing we must make sure about is that this
definition uniquely determines $(\alpha_1 x_1 + \alpha_2 x_2)^*$. If, in
other words, $y_1^* = x_1^*$, and $y_2^* = x_2^*$, then we must be sure
that $(\alpha_1 y_1 + \alpha_2 y_2)^* = (\alpha_1 x_1 + \alpha_2 x_2)^*$. This is true:
the reader may verify that it is implied by the linearity
of the congruence relation. The space $\mathcal{V}^*$ is called
the quotient space of $\mathcal{V}$ modulo $\mathcal{M}$. This notion is
very rich in interesting properties which we shall not
explore; we hope that the techniques we have developed
in this book will enable the reader to ask and answer
the relevant questions, such for example as the relation
of quotient spaces to conjugate spaces, annihilators, etc.
We merely describe the little concerning linear
transformations that we shall have occasion to use.

If $A$ is a linear transformation on $\mathfrak{V}$, and if $\mathfrak{M}$
is a linear manifold invariant under $A$, we define a
linear transformation $A^*$ on $\mathfrak{V}^*$ by $A^*x^* = (Ax)^*$. It
is again easy to verify that $A^*$ is uniquely defined.
What we are interested in is this: if $A$ happens not
only to be reduced but also to be completely reduced by





$\mathfrak{M}$, so that $A$ becomes the direct sum $A_1 \oplus A_2$ of
two linear transformations defined on the subspaces $\mathfrak{M}$
and $\mathfrak{N}$ of $\mathfrak{V}$ respectively, then what is the relation
between $A_2$ and $A^*$? Both these transformations can be
considered as complementary to $A_1$; $A_1$ describes what
$A$ does on $\mathfrak{M}$, and both $A_2$ and $A^*$ describe in
different ways what $A$ does elsewhere.

Let $x$ and $y$ be any two elements of $\mathfrak{N}$, and
consider the corresponding elements $x^*$ and $y^*$ of
$\mathfrak{V}^* = \mathfrak{V}/\mathfrak{M}$. If it should happen that $x^* = y^*$,
so that $(x - y)^* = 0$, then this means that $x - y$ is
in $\mathfrak{M}$ and $\mathfrak{N}$ at the same time and consequently $x = y$.
Since on the other hand every element $x$ of $\mathfrak{V}$ has the
form $x = y + z$, with $y$ in $\mathfrak{M}$ and $z$ in $\mathfrak{N}$, we see
that $x^* = z^*$, so that every $x^*$ in $\mathfrak{V}^*$ is the star
of some element in $\mathfrak{N}$. In other words the
correspondence $x \to Tx = x^*$ is a one to one correspondence
between $\mathfrak{N}$ and $\mathfrak{V}^*$; it follows from the definition of $x^*$
that $T$ is a linear transformation. (Thus we obtain in
particular that $\mathfrak{N}$ and $\mathfrak{V}^*$ are isomorphic vector
spaces.) Let us now compare the transformation $A_2$ (i.e.
$A$ on $\mathfrak{N}$) with $A^*$. If $A_2x = y$, then $A^*x^* = (Ax)^* = y^*$;
in other words $A^*Tx = Ty = TA_2x$. This implies that
$A^*T = TA_2$ or $A^* = TA_2T^{-1}$. Loosely speaking (see § 35)
we may say that $A^*$ transforms $\mathfrak{V}^*$ the same way as $A_2$
transforms $\mathfrak{N}$. In particular all characteristic
features (such as proper values, invariant manifolds, etc.)
are shared by $A_2$ and $A^*$; as linear transformations
they are abstractly identical (isomorphic). We shall
exploit this fact presently; at the moment we turn to
our next topic.

A linear transformation $A$ is called nilpotent of
index $q$ if $A^q = 0$ and $A^{q-1} \neq 0$, for some positive
integer $q$. Concerning nilpotent transformations we
shall need the following not particularly exciting but
quite useful theorem.
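
The definition of the index can be tried out on small matrices. What follows is only an illustrative sketch in Python and not part of the original text: the function names are hypothetical, and a transformation is assumed to be given by a square matrix stored as a list of rows with exact (integer) entries.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(m):
    """True if every entry of the matrix vanishes."""
    return all(entry == 0 for row in m for entry in row)


def nilpotency_index(a):
    """Return the smallest q >= 1 with a**q == 0, or None if a is not nilpotent.

    On an n-dimensional space the index of a nilpotent transformation is
    at most n, so it is enough to test the powers a**1, ..., a**n.
    """
    power = a
    for q in range(1, len(a) + 1):
        if is_zero(power):
            return q
        power = matmul(power, a)
    return None


# A shift on a 3-dimensional space: e1 -> e2 -> e3 -> 0
# (column k holds the image of the k-th basis vector).
shift = [[0, 0, 0],
         [1, 0, 0],
         [0, 1, 0]]

print(nilpotency_index(shift))
```

For the shift above the second power is still different from zero and the third power vanishes, so the index is 3; the identity matrix is reported as not nilpotent.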




THEOREM 1. If $A$ is nilpotent of index $q$, and $x_0$ is
any vector for which $A^{q-1}x_0 \neq 0$, then the vectors
$x_0, Ax_0, \ldots, A^{q-1}x_0$ are linearly independent. If we
write $\mathfrak{H}$ for the linear manifold spanned by these
vectors then there exists a linear manifold $\mathfrak{K}$ such
that $\mathfrak{V} = \mathfrak{H} \oplus \mathfrak{K}$ and such that $A$ is completely
reduced by the pair $(\mathfrak{H}, \mathfrak{K})$.


PROOF. We prove first the asserted linear independence.
If $\sum_{i=0}^{q-1} \alpha_i A^i x_0 = 0$, let $j$ be the first
integer for which $\alpha_j \neq 0$. (We do not exclude the
possibility $j = 0$.) Then dividing through by $-\alpha_j$ and
changing notation in an obvious way we obtain a relation
of the form

   $A^j x_0 = \sum_{i=j+1}^{q-1} \alpha_i A^i x_0 = A^{j+1}\left(\sum_{i=j+1}^{q-1} \alpha_i A^{i-j-1} x_0\right) = A^{j+1} y.$

It follows from the definition of $q$ that

   $A^{q-1} x_0 = A^{q-j-1} A^j x_0 = A^{q-j-1} A^{j+1} y = A^q y = 0;$

since this contradicts the choice of $x_0$, we must have
$\alpha_j = 0$ for each $j$.

It is clear that $\mathfrak{H}$ reduces $A$; to construct $\mathfrak{K}$ we
go by induction on the index $q$ of nilpotence. If
$q = 1$ the theorem is trivial; we assume the result for
$q - 1$. The range $\mathfrak{R}$ of $A$ is a linear manifold which
reduces $A$; on $\mathfrak{R}$, $A$ is nilpotent of index $q - 1$. We
write $\mathfrak{H}_0 = \mathfrak{H} \cap \mathfrak{R}$ and $y_0 = Ax_0$; then $\mathfrak{H}_0$ is
spanned by the linearly independent vectors
$y_0, Ay_0, \ldots, A^{q-2}y_0$. The induction hypothesis may be
applied: $\mathfrak{R}$ is the direct sum of $\mathfrak{H}_0$ and some other
linear manifold $\mathfrak{K}_0$, and $A\mathfrak{K}_0 \subset \mathfrak{K}_0$.

We write $\mathfrak{K}_1$ for the set of all vectors $x$ for
which $Ax$ is in $\mathfrak{K}_0$; it is clear that $\mathfrak{K}_1$ is a
linear manifold. The temptation is great to set $\mathfrak{K} = \mathfrak{K}_1$
and to attempt to prove that this $\mathfrak{K}$ has the desired
properties, but unfortunately this need not be true:
$\mathfrak{H}$ and $\mathfrak{K}_1$ need not be disjoint. (It is true,
although we shall not make explicit use of this fact,
that the intersection of $\mathfrak{H}$ and $\mathfrak{K}_1$ is contained in
the null space $\mathfrak{N}$ of $A$.) That in spite of this $\mathfrak{K}_1$
is useful is caused by the fact that $\mathfrak{H} + \mathfrak{K}_1 = \mathfrak{V}$.
For if $x$ is any vector then $Ax$ is in $\mathfrak{R}$ and
consequently $Ax = y + z$ with $y$ in $\mathfrak{H}_0$ and $z$ in $\mathfrak{K}_0$.
The general element of $\mathfrak{H}_0$ is a linear combination of
$Ax_0, \ldots, A^{q-1}x_0$; hence we have

   $y = \sum_{i=1}^{q-1} \alpha_i A^i x_0 = A\left(\sum_{i=0}^{q-2} \alpha_{i+1} A^i x_0\right) = Ay_1,$

where $y_1$ is in $\mathfrak{H}$. Consequently $Ax = Ay_1 + z$,
$A(x - y_1) = z$, so that $A(x - y_1)$ is in $\mathfrak{K}_0$. This
means that $x - y_1$ is in $\mathfrak{K}_1$, so that $x$ is the sum
of an element (namely $y_1$) of $\mathfrak{H}$ and an element of $\mathfrak{K}_1$.

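As far as disjointness is concerned we can say at
least that $\mathfrak{H} \cap \mathfrak{K}_0 = \mathfrak{O}$. For if $x$ is in $\mathfrak{H}$, then
$Ax$ is in $\mathfrak{H}_0$. Since $\mathfrak{K}_0$ is invariant under $A$, $Ax$ is
in $\mathfrak{K}_0$ along with $x$; consequently if $x$ is common to
$\mathfrak{H}$ and $\mathfrak{K}_0$ then $Ax$ is common to $\mathfrak{H}_0$ and $\mathfrak{K}_0$, so
that $Ax = 0$. But for an element $x$ of $\mathfrak{H}$, $Ax = 0$
implies that $x$ is in $\mathfrak{H}_0$. (If $x = \sum_{i=0}^{q-1} \alpha_i A^i x_0$ and
$Ax = \sum_{i=1}^{q-1} \alpha_{i-1} A^i x_0 = 0$, then, using the linear
independence of the $A^i x_0$, it follows that
$\alpha_0 = \cdots = \alpha_{q-2} = 0$, so that $x = \alpha_{q-1} A^{q-1} x_0$.) Hence if
$x$ belongs to $\mathfrak{H} \cap \mathfrak{K}_0$ it belongs to $\mathfrak{H}_0 \cap \mathfrak{K}_0 = \mathfrak{O}$;
consequently $x = 0$.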

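The situation now is this: $\mathfrak{H}$ and $\mathfrak{K}_1$ together
span $\mathfrak{V}$ and $\mathfrak{K}_1$ contains the two disjoint linear
manifolds $\mathfrak{K}_0$ and $\mathfrak{H} \cap \mathfrak{K}_1$. If we let $\mathfrak{K}_0'$ be any
complement of $\mathfrak{K}_0 \oplus (\mathfrak{H} \cap \mathfrak{K}_1)$ in $\mathfrak{K}_1$, i.e. if
$\mathfrak{K}_0' \oplus \mathfrak{K}_0 \oplus (\mathfrak{H} \cap \mathfrak{K}_1) = \mathfrak{K}_1$, then we may write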






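$\mathfrak{K} = \mathfrak{K}_0' \oplus \mathfrak{K}_0$; we assert that this $\mathfrak{K}$ has the
desired properties. In the first place $\mathfrak{K} \subset \mathfrak{K}_1$ and
$\mathfrak{K}$ is disjoint from $\mathfrak{H} \cap \mathfrak{K}_1$; it follows that
$\mathfrak{H} \cap \mathfrak{K} = \mathfrak{O}$. In the second place $\mathfrak{H} \oplus \mathfrak{K}$
contains both $\mathfrak{H}$ and $\mathfrak{K}_1$, so that $\mathfrak{H} \oplus \mathfrak{K} = \mathfrak{V}$. Finally
$\mathfrak{K}$ is invariant under $A$, since the fact that $\mathfrak{K} \subset \mathfrak{K}_1$
implies that $A\mathfrak{K} \subset \mathfrak{K}_0 \subset \mathfrak{K}$; q.e.d.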

Later we shall need the following remark. If $\tilde{x}_0$
is any other vector for which $A^{q-1}\tilde{x}_0 \neq 0$, and if $\tilde{\mathfrak{H}}$ is
the linear manifold spanned by the vectors
$\tilde{x}_0, A\tilde{x}_0, \ldots, A^{q-1}\tilde{x}_0$, and if, finally, $\tilde{\mathfrak{K}}$ is any linear
manifold which together with $\tilde{\mathfrak{H}}$ completely reduces $A$, then
the behaviour of $A$ on $\tilde{\mathfrak{H}}$ and $\tilde{\mathfrak{K}}$ is the same as its
behaviour on $\mathfrak{H}$ and $\mathfrak{K}$ respectively. (In other words:
in spite of the apparent non uniqueness in the statement
of Theorem 1, everything is uniquely determined up
to isomorphisms.) This follows immediately from our
discussion of quotient spaces; this in fact is the reason
we introduced them.

Using Theorem 1 we can find a complete geometric
characterization of nilpotent transformations.

THEOREM 2. If $A$ is a nilpotent linear transformation
of index $q$ on a finite dimensional vector space $\mathfrak{V}$,
then we can find a positive integer $r$, $r$ vectors
$x_1, \ldots, x_r$, and $r$ positive integers $q_1, \ldots, q_r$ such
that the vectors

   $x_1, Ax_1, \ldots, A^{q_1-1}x_1,$
   $x_2, Ax_2, \ldots, A^{q_2-1}x_2,$
   $\ldots$
   $x_r, Ax_r, \ldots, A^{q_r-1}x_r$

form a basis for $\mathfrak{V}$, and such that
$A^{q_1}x_1 = A^{q_2}x_2 = \cdots = A^{q_r}x_r = 0$. The integers
$r, q_1, \ldots, q_r$ are a complete set of invariants of $A$;
if, in other words, $B$ is any other nilpotent linear
transformation on a finite dimensional vector space $\mathfrak{W}$
then there is an isomorphism $T$ between $\mathfrak{V}$ and $\mathfrak{W}$
for which $TAT^{-1} = B$ if and only if $B$ has the same
$r, q_1, \ldots, q_r$ as $A$.


PROOF. We write $q_1 = q$, and we choose $x_1$ to be
any vector for which $A^{q_1-1}x_1 \neq 0$. From Theorem 1
we know that the linear manifold spanned by
$x_1, Ax_1, \ldots, A^{q_1-1}x_1$ completely reduces $A$, or, in
other words, that we may find a complementary manifold
which also reduces $A$ and which, naturally, has
definitely lower dimension than $\mathfrak{V}$. On this complementary
manifold $A$ is nilpotent of index, say, $q_2$; we apply to
this manifold the same reduction procedure (beginning
with a vector $x_2$ for which $A^{q_2-1}x_2 \neq 0$), and we
continue thus by induction until we exhaust the space.
This proves the existence we asserted; the uniqueness
follows from the uniqueness (up to isomorphisms) of the
decomposition given by Theorem 1.

Using the basis $\{A^i x_j\}$ the matrix of $A$ takes on
a particularly simple form; every matrix element not on
the diagonal just below the main diagonal vanishes,
(i.e. $\alpha_{ij} \neq 0$ implies $j = i - 1$), and the elements
below the main diagonal begin (at top) with a string of
1's followed by a single 0, then go on with another
string of 1's followed by a 0, and continue so on to the
end, with the lengths of the strings of 1's monotonely
decreasing (or, at any rate, non increasing).
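
The shape just described can be written down mechanically. The following Python sketch is an illustration only and is not taken from the original text; the function names are hypothetical, and it assumes the convention that the $k$-th column of the matrix holds the coordinates of the image of the $k$-th basis vector.

```python
def nilpotent_canonical_matrix(block_sizes):
    """Matrix of A in the basis x_1, A x_1, ..., A^(q_1 - 1) x_1, x_2, A x_2, ...

    block_sizes lists the integers q_1, ..., q_r. Column k holds the image
    of the k-th basis vector (an assumed convention), so the 1's land on
    the diagonal just below the main diagonal.
    """
    n = sum(block_sizes)
    m = [[0] * n for _ in range(n)]
    start = 0
    for q in block_sizes:
        for k in range(start, start + q - 1):
            m[k + 1][k] = 1   # A sends basis vector k to basis vector k + 1
        start += q            # the last vector of each block is sent to 0
    return m


def subdiagonal(m):
    """Entries on the diagonal just below the main diagonal, read from the top."""
    return [m[i + 1][i] for i in range(len(m) - 1)]


example = nilpotent_canonical_matrix([3, 2, 1])
print(subdiagonal(example))
```

With $q_1 = 3$, $q_2 = 2$, $q_3 = 1$ the subdiagonal reads 1, 1, 0, 1, 0: a string of two 1's, a 0, a string of one 1, a 0, and finally a string of length zero.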

It is a sound geometric intuition that makes most






people conjecture that for linear transformations being
non singular and being in some sense zero are exactly
opposite notions. Our disappointment in finding that
$\mathfrak{R}(A)$ and $\mathfrak{N}(A)$ need not be disjoint is connected
with this conjecture. The situation can be straightened
out by relaxing the sense in which we interpret "being
zero"; for most practical purposes a linear transformation
some power of which is zero (i.e. a nilpotent
transformation) is as zeroish as we can expect it to be.
Although we cannot say that a linear transformation is
either non singular or "zero" even in the extended sense
of zeroness, we can say how any transformation is made
up of these two extreme kinds.

THEOREM 3. Every linear transformation $A$ on a finite
dimensional vector space $\mathfrak{V}$ is the direct sum of a
nilpotent transformation and a non singular transformation.


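PROOF. We consider the null space of the $k$-th
power of $A$: this is a linear manifold $\mathfrak{N}_k = \mathfrak{N}(A^k)$.
Clearly $\mathfrak{N}_k \subset \mathfrak{N}_{k+1}$. We assert first: if ever
$\mathfrak{N}_k = \mathfrak{N}_{k+1}$ then $\mathfrak{N}_k = \mathfrak{N}_{k+j}$ for all positive
integers $j$. For if $A^{k+j}x = 0$ then $A^{k+1}A^{j-1}x = 0$,
whence (using the fact that $\mathfrak{N}_k = \mathfrak{N}_{k+1}$) $A^kA^{j-1}x = 0$,
so that $A^{k+j-1}x = 0$. In other words $\mathfrak{N}_{k+j}$ is
contained in (and therefore equal to) $\mathfrak{N}_{k+j-1}$; induction
on $j$ establishes our assertion.

Since $\mathfrak{V}$ is finite dimensional the manifolds $\mathfrak{N}_k$
cannot continue to increase indefinitely; we let $q$ be
the smallest positive integer for which $\mathfrak{N}_q = \mathfrak{N}_{q+1}$.
It is clear that $\mathfrak{N}_q$ reduces $A$ (in fact each $\mathfrak{N}_k$
does). We write $\mathfrak{R}_k = \mathfrak{R}(A^k)$ for the range of $A^k$
(so that again it is clear that $\mathfrak{R}_q$ reduces $A$); we
shall prove that $\mathfrak{N}_q \oplus \mathfrak{R}_q = \mathfrak{V}$ and that on $\mathfrak{N}_q$ $A$
is nilpotent, whereas on $\mathfrak{R}_q$ it is non singular.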




If $x$ is a vector common to $\mathfrak{N}_q$ and $\mathfrak{R}_q$ then
$A^q x = 0$ and $x = A^q y$ for some $y$. It follows that
$A^{2q}y = 0$, and hence, using the definition of $q$, that
$x = A^q y = 0$. Thus we have shown that the range and the
null space of $A^q$ are disjoint; a dimensionality
argument (see §57) shows that they span $\mathfrak{V}$, so that $\mathfrak{V}$ is
their direct sum. It follows from the definitions of $q$
and $\mathfrak{N}_q$ that $A$ on $\mathfrak{N}_q$ is nilpotent of index $q$.
If, finally, $x$ is in $\mathfrak{R}_q$ (so that $x = A^q y$ for some
$y$) and $Ax = 0$ then $A^{q+1}y = 0$, whence $x = A^q y = 0$;
this shows that $A$ is non singular on $\mathfrak{R}_q$ and
concludes the proof of Theorem 3.
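
The integer $q$ of the proof can be found numerically by watching the ranks of the successive powers. The sketch below is an illustration under stated assumptions and is not part of the original text: the names are hypothetical, matrices are lists of rows with integer or rational entries, and the rank is computed by ordinary Gaussian elimination over the rationals.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(m):
    """Rank of a matrix, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def stabilizing_index(a):
    """Smallest q >= 1 for which the null spaces of a**q and a**(q+1) coincide.

    The null space of a**q is contained in that of a**(q+1), so the two
    coincide exactly when the ranks agree.
    Returns (q, dimension of the null space of a**q, rank of a**q).
    """
    n = len(a)
    power, q = a, 1
    while True:
        next_power = matmul(power, a)
        if rank(power) == rank(next_power):
            return q, n - rank(power), rank(power)
        power, q = next_power, q + 1


# A nilpotent part of index 2 on the first two coordinates,
# and a non singular part (multiplication by 2) on the third.
example = [[0, 1, 0],
           [0, 0, 0],
           [0, 0, 2]]

print(stabilizing_index(example))
```

For this example the ranks of the first three powers are 2, 1, 1, so $q = 2$; the null space of $A^2$ has dimension 2 and the range of $A^2$ has dimension 1, and the two dimensions add up to 3.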

Once again we remind the reader that the consideration
of quotient spaces shows that (up to isomorphisms)
the behaviour of the nilpotent and non singular parts of
$A$ is uniquely determined by $A$.

We can now use our results on nilpotent transformations
to study the structure of arbitrary transformations.
The method of getting a nilpotent transformation
out of an arbitrary one may seem like a conjuring trick,
but it is a very useful trick which is very often
employed.

THEOREM 4. If $A$ is any linear transformation on an
$n$-dimensional vector space $\mathfrak{V}$, let $\lambda_1, \ldots, \lambda_p$ be
the distinct proper values of $A$, with respective
multiplicities $m_1, \ldots, m_p$. Then $\mathfrak{V}$ is the direct sum
of $p$ linear manifolds $\mathfrak{M}_1, \ldots, \mathfrak{M}_p$, of dimensions
$m_1, \ldots, m_p$ such that each $\mathfrak{M}_j$ reduces $A$ and such
that on $\mathfrak{M}_j$ $A$ has the form $B_j + \lambda_j 1$ where $B_j$ is
nilpotent.

PROOF. Take any fixed $j = 1, \ldots, p$, and consider

the linear transformation $A_j = A - \lambda_j 1$. To $A_j$ we may




apply the decomposition of Theorem 3 to obtain linear
manifolds $\mathfrak{M}_j$ and $\mathfrak{N}_j$ such that on $\mathfrak{M}_j$ $A_j$ is
nilpotent and on $\mathfrak{N}_j$ it is non singular. Since $\mathfrak{M}_j$
is invariant under $A_j$ it is also invariant under
$A_j + \lambda_j 1 = A$. Hence the determinant $\det(A - \lambda 1)$ is
(for every $\lambda$) the product of the two corresponding
determinants for the two linear transformations that $A$
becomes when we consider it on $\mathfrak{M}_j$ and $\mathfrak{N}_j$ separately.
Since on $\mathfrak{M}_j$ the only proper value of $A$ is $\lambda_j$,
and since on $\mathfrak{N}_j$ $A$ does not have the proper value $\lambda_j$
($A - \lambda_j 1$ is non singular on $\mathfrak{N}_j$) it follows that the
dimension of $\mathfrak{M}_j$ is exactly $m_j$, and that for $i \neq j$
$\mathfrak{M}_i$ and $\mathfrak{M}_j$ are disjoint. A dimension argument
proves that $\mathfrak{M}_1 \oplus \cdots \oplus \mathfrak{M}_p = \mathfrak{V}$, and thereby
concludes the proof of the theorem.

We shall leave to the reader the details of putting
together the results of Theorems 2 and 4; we shall
merely describe the final result in matricial language.

Given any linear transformation $A$ on a finite
dimensional vector space $\mathfrak{V}$ there exists a basis $\mathfrak{X}$ of
$\mathfrak{V}$ such that with respect to this basis the matrix of
$A$ has the following form. Every element not on or
immediately below the main diagonal vanishes. On the main
diagonal there appear the distinct proper values of $A$,
each a number of times equal to its multiplicity.
Below any particular proper value $\lambda$ there appear only
1's and 0's, and these in the following way: there are
chains of 1's followed by a single 0, with the lengths
of the chains decreasing as we read from top to bottom.
This matrix is the Jordan or classical canonical form
of $A$; we have $B = TAT^{-1}$ if and only if the classical
canonical forms of $A$ and of $B$ are the same except for
the order of the proper values. (Thus, in particular, a
linear transformation $A$ has in some coordinate system
a diagonal matrix if and only if its classical canonical
form is already diagonal, i.e. if every chain of 1's has






length zero!) 

Let us introduce some notation. Let $A$ have $p$
distinct proper values $\lambda_1, \ldots, \lambda_p$ with multiplicities
$m_1, \ldots, m_p$ as before; let the number of chains of 1's
under $\lambda_j$ be $r_j$, and let the lengths of these chains
be $q_{j1} - 1, q_{j2} - 1, \ldots, q_{jr_j} - 1$. The polynomial
$(\lambda - \lambda_j)^{q_{ji}}$ is called an elementary divisor of $A$ of
multiplicity $q_{ji}$ belonging to the proper value $\lambda_j$.
An elementary divisor is called simple if $q_{ji} = 1$ (so
that the chain length $q_{ji} - 1 = 0$); we see that a linear
transformation $A$ has (in a suitable coordinate system)
a diagonal matrix if and only if the elementary divisors
are simple.

Theorem 4 does for arbitrary linear transformations
what the spectral theorem did for normal ones. We make
one application to exhibit the power of the theorem.
The linear transformation $B_j$, described in the statement,
is nilpotent of index $q_{j1}$, or, in other words, $A$ on
$\mathfrak{M}_j$ (where it has the form $B_j + \lambda_j 1$) is annulled by
the polynomial $(\lambda_j - \lambda)^{q_{j1}}$. It follows that $A$ is
annulled by the product of these polynomials (i.e. by the
product of the elementary divisors of highest multiplicities);
this product is called the minimal polynomial of $A$. It
is quite easy to see (since the index of nilpotence of
$B_j$ is exactly $q_{j1}$) that this polynomial is indeed
uniquely determined (up to a multiplicative factor)
as the polynomial of smallest degree which annuls $A$.
Since the characteristic polynomial $\det(A - \lambda 1)$ is the
product of all the elementary divisors and therefore a
multiple of the minimal polynomial we obtain the
Hamilton-Cayley equation: every linear transformation
is annulled by its characteristic polynomial.
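
These statements can be verified on a small matrix already in classical canonical form. The Python sketch below is an illustration only and is not part of the original text; the helper names are hypothetical, the factors are written as $A - \lambda_j 1$ (which differs from $\lambda_j 1 - A$ only by a sign), and the sample matrix is an assumption chosen for the example.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def shifted(a, lam):
    """Return a - lam * 1."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


def is_zero(m):
    return all(entry == 0 for row in m for entry in row)


def evaluate_product(a, factors):
    """Evaluate the product of (a - lam * 1)**k over the pairs (lam, k)."""
    result = identity(len(a))
    for lam, k in factors:
        for _ in range(k):
            result = matmul(result, shifted(a, lam))
    return result


# Proper value 2 with chains of lengths 1 and 0 (q = 2 and q = 1),
# proper value 5 with a single chain of length 0 (q = 1).
jordan = [[2, 0, 0, 0],
          [1, 2, 0, 0],
          [0, 0, 2, 0],
          [0, 0, 0, 5]]

minimal = [(2, 2), (5, 1)]          # elementary divisors of highest multiplicity
characteristic = [(2, 3), (5, 1)]   # product of all the elementary divisors
too_small = [(2, 1), (5, 1)]        # a proper divisor of the minimal polynomial

print(is_zero(evaluate_product(jordan, minimal)))
print(is_zero(evaluate_product(jordan, characteristic)))
print(is_zero(evaluate_product(jordan, too_small)))
```

The first two products vanish and the third does not: the matrix is annulled by its minimal and by its characteristic polynomial, but not by the polynomial obtained by lowering the exponent belonging to the proper value 2.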



APPENDIX II 


DIRECT PRODUCTS 


In this appendix we shall describe a method of putting
two vector spaces together to make a third, namely
the formation of their direct product, $\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$.
Although we had no occasion to make use of direct
products in this book their theory is closely allied to some
of the subjects we did treat and is useful in other
related parts of mathematics, such as the theory of group
representations and the tensor calculus. The notion is
essentially more complicated than that of direct sum; we
shall therefore begin by giving some examples of what a
direct product should be, and the study of these examples
will guide us in laying down the definition.

(1) Let $\mathfrak{U}$ be the set of all polynomials $x(s)$ in
a real variable $s$; let $\mathfrak{V}$ be the set of all polynomials
$y(t)$ in another real variable $t$; and, finally, let $\mathfrak{W}$ be
the set of all polynomials $z(s,t)$ in the two real
variables $s$ and $t$. $\mathfrak{U}$, $\mathfrak{V}$, and $\mathfrak{W}$ are all vector
spaces; in this particular case we should like to call
$\mathfrak{W}$, or something like it, the direct product of $\mathfrak{U}$ and
$\mathfrak{V}$. One reason for this terminology is that if we take
any $x$ in $\mathfrak{U}$ and any $y$ in $\mathfrak{V}$, we may form their
product $x(s)y(t) = z(s,t)$. (This is the ordinary
numerical product of two polynomials.) Clearly $z(s,t)$ is
an element of $\mathfrak{W}$. (Here, as before, we are studiously
ignoring the irrelevant fact that we may even multiply
together elements of $\mathfrak{U}$, i.e. that the product of two
polynomials is another one. Vector spaces in which a
decent concept of multiplication is defined are called



algebras and their study, as such, lies outside the scope
of this book.)

(2) In the preceding example we considered vector
spaces whose elements are functions of real variables.
We may if we desire view the simple vector space $\mathfrak{C}_n$ as
the collection of all functions defined on a set
consisting of exactly $n$ points, say the first $n$ positive
integers. In other words a vector $\{\xi_1, \ldots, \xi_n\}$ may
be considered as a function $\xi(i)$ of $i$, defined for
$i = 1, \ldots, n$; the definition of the vector operations
in $\mathfrak{C}_n$ is such that they correspond, in the new
notation, to the ordinary numerical operations performed on
the functions $\xi(i)$. If, simultaneously, we consider $\mathfrak{C}_m$
as the collection of all functions, say $\eta(j)$, defined
for $j = 1, \ldots, m$, then we should like the direct product
of $\mathfrak{C}_n$ and $\mathfrak{C}_m$ to be the set of all functions $\zeta(i,j)$
defined for $i = 1, \ldots, n$, $j = 1, \ldots, m$. The direct product,
in other words, is the collection of all functions defined
on a set consisting of exactly $mn$ points; this is $\mathfrak{C}_{nm}$.
This example brings out a property of direct products,
namely the multiplicativity of dimension, that we should
like to retain in the general case.

Let us now try to abstract the most important
properties of these examples. The definition of direct sum
was one possible rigorization of the crude intuitive
idea of writing down, formally, the sum of two vectors
belonging to different vector spaces. Similarly our
examples suggest that the direct product, $\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$,
of two vector spaces should be such that to every $x$ in
$\mathfrak{U}$ and $y$ in $\mathfrak{V}$ there corresponds a "product", say $z$,
in $\mathfrak{W}$, $z = x \otimes y$, in such a way that the correspondence
between $x$ and $z$, for a fixed $y$, as well as the
correspondence between $y$ and $z$, for a fixed $x$, is
linear. (I.e. $(\alpha_1 x_1 + \alpha_2 x_2) \otimes y$ should be equal to
$\alpha_1(x_1 \otimes y) + \alpha_2(x_2 \otimes y)$, and a similar formula should
hold for $x \otimes (\alpha_1 y_1 + \alpha_2 y_2)$.) In one word: $x \otimes y$
should be a bilinear (vector valued) function of $x$ and $y$.






The notion of formal multiplication also suggests
that if $x'$ and $y'$ are linear functionals on $\mathfrak{U}$ and
$\mathfrak{V}$ then it is their product, $x'(x) \cdot y'(y)$, that should
in some sense be the general element of $\mathfrak{W}'$. This
product is a function $z'(x,y)$ defined for $x$ in $\mathfrak{U}$ and
$y$ in $\mathfrak{V}$, with the property that for each fixed value
of one variable it is a linear function of the other: in
one word $z'(x,y)$ is a bilinear (scalar valued)
functional of $x$ and $y$.

After one more word of explanation we shall be ready
to give the definition. It turns out to be technically
preferable to define not $\mathfrak{W}$ itself, but $\mathfrak{W}'$; we shall
use the reflexivity property ($\mathfrak{W}'' = \mathfrak{W}$) to define $\mathfrak{W}$.
Since we have proved the validity of this property for
finite dimensional spaces only, we shall frame the
definition for such spaces; we merely remark that the
definition may be used in any case (not necessarily finite
dimensional) in which reflexivity has been proved.

The preceding discussion was given to motivate the
formal work we now begin. The actual definition is not
as terrifying as the introduction might seem to make it.

DEFINITION. If $\mathfrak{U}$ and $\mathfrak{V}$ are any two finite
dimensional vector spaces, we write $\mathfrak{W}'$ for the vector
space of all bilinear functionals $z'(x,y)$, defined for
$x$ in $\mathfrak{U}$ and $y$ in $\mathfrak{V}$. The conjugate space $\mathfrak{W}$ of
$\mathfrak{W}'$ is the direct product of $\mathfrak{U}$ and $\mathfrak{V}$,
$\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$. To each pair of vectors, $x, y$, with
$x$ in $\mathfrak{U}$ and $y$ in $\mathfrak{V}$, we make correspond a vector
$z$ in $\mathfrak{W}$, called the product of $x$ and $y$,
$z = x \otimes y$, defined by $z(z') = z'(x,y)$ for every $z'$
in $\mathfrak{W}'$.

The notation of this definition is slightly out of
tune with our preceding custom. $\mathfrak{W}'$ is defined directly,





and not as the conjugate space of anything, and doesn't,
therefore, merit the prime. $\mathfrak{W}$ on the other hand is the
conjugate space of something, namely $\mathfrak{W}'$, and should
therefore be denoted by $(\mathfrak{W}')'$. The situation is saved
by the reflexivity of $\mathfrak{W}'$; since we have the natural
isomorphism between $\mathfrak{W}'$ and $(\mathfrak{W}')''$, $\mathfrak{W}'$ may indeed be
thought of as the conjugate space of $\mathfrak{W} = (\mathfrak{W}')'$, and
since it is the space $\mathfrak{W}$, and not its conjugate, that
we are primarily interested in, we reserve the simpler
notation for it.

To get at the point as quickly as possible we slid
over a little too lightly a part of the definition. It
is probably clear that the set of all bilinear functionals
does form a vector space, with the linear operations
defined in a way which should by now be obvious to the
reader; it takes however another minute's reflection to
see that it is finite dimensional. (We referred in the
preceding paragraph to the fact that it's reflexive.)
A presentation of the detailed theory of bilinear
functionals would take us too far afield, and would at the
same time be a boring repetition of what we have already
done for single linear functionals. We shall sketch, in
the proof of Theorem 1 below, a few of the main facts.

We are not going to go deeply into the theory of
direct products. The definition we gave is one of the
quickest rigorous approaches to the theory, although it
leads to some unpleasant technical difficulties later.
Whatever its disadvantages, however, we observe that it
obviously has one of the desired properties: it is
clear, namely, that $z = x \otimes y$ depends linearly on
each of its factors.

Another possible (and quite popular) definition of
direct product is by formal products: $\mathfrak{W}$ is the set of
all symbols of the form $\sum_i \alpha_i (x_i \otimes y_i)$. (For the
purist: $x_i \otimes y_i$ is supposed, in this definition, to
stand for the pair $\{x_i, y_i\}$; the multiplication sign is




merely a reminder of what to expect.) Neither definition
is simple; we adopted the one we gave because it
seemed more constructive.

We prove now the two fundamental theorems which had
better be true, and which serve, in part, as further
justification of the product terminology.

THEOREM 1. The dimension of the direct product,
$\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$, of two finite dimensional vector
spaces is the product of their dimensions.

PROOF. Since $\mathfrak{W}$ is defined as the conjugate space
of an auxiliary space $\mathfrak{W}'$, it is sufficient to prove
that if $\mathfrak{U}$ is $n$-dimensional and $\mathfrak{V}$ is $m$-dimensional
then the dimension of $\mathfrak{W}'$ is $nm$. We sketch the steps.

(1) If $\mathfrak{X} = \{x_1, \ldots, x_n\}$ and $\mathfrak{Y} = \{y_1, \ldots, y_m\}$
are any two bases in $\mathfrak{U}$ and $\mathfrak{V}$ respectively, and if
$\alpha_{ij}$ ($i = 1, \ldots, n$; $j = 1, \ldots, m$) are any $nm$
scalars, then there is one and only one bilinear
functional $z'(x,y)$ (i.e. there is one and only one element
$z'$ in $\mathfrak{W}'$) for which $z'(x_i, y_j) = \alpha_{ij}$ (see Theorem 1,
§14). For if $x = \sum_i \xi_i x_i$ and $y = \sum_j \eta_j y_j$ then

(1)   $z'(x,y) = \sum_i \sum_j \xi_i \eta_j z'(x_i, y_j),$

so that $z'$ is clearly uniquely determined by the
prescribed conditions; that any $z'$ is determined at all
follows by writing $z'(x_i, y_j) = \alpha_{ij}$ and reading
formula (1) from right to left, i.e. defining $z'$ by it.

(2) Using the result just obtained we define
$z'_{pq}(x,y)$, for $p = 1, \ldots, n$, $q = 1, \ldots, m$, by
$z'_{pq}(x_i, y_j) = \delta_{ip}\delta_{jq}$, (see Theorem 2, §14); the
$z'_{pq}$ are a basis in $\mathfrak{W}'$. They are linearly independent,
since

   $\sum_p \sum_q \alpha_{pq} z'_{pq} = 0$

implies





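   $0 = \sum_p \sum_q \alpha_{pq} z'_{pq}(x_i, y_j) = \sum_p \sum_q \alpha_{pq} \delta_{ip} \delta_{jq} = \alpha_{ij};$

and if $z'(x,y)$ is any element of $\mathfrak{W}'$, with $z'(x_i, y_j)
= \alpha_{ij}$, then $z'_{pq}(x,y) = \xi_p \eta_q$, so that (substituting
into (1)),

(2)   $z'(x,y) = \sum_p \sum_q z'_{pq}(x,y) \alpha_{pq}.$

It follows that the $z'_{pq}$ are a basis in $\mathfrak{W}'$; this
completes the proof of the theorem.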


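THEOREM 2. If $\mathfrak{X} = \{x_1, \ldots, x_n\}$ and
$\mathfrak{Y} = \{y_1, \ldots, y_m\}$ are bases in $\mathfrak{U}$ and $\mathfrak{V}$
respectively, then the set $\mathfrak{Z}$ of vectors
$z_{ij} = x_i \otimes y_j$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, is a
basis in $\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$.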


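PROOF. In the proof of the preceding theorem we
defined a basis (namely the $z'_{ij}$) in $\mathfrak{W}'$; let
$\mathfrak{Z} = \{z_{ij}\}$, in $\mathfrak{W} = (\mathfrak{W}')'$, be the dual basis. We
assert that $z_{ij} = x_i \otimes y_j$. To prove this assertion we
recall that the $z_{ij}$ are linear functionals on $\mathfrak{W}'$
satisfying the conditions

   $z_{ij}(z'_{pq}) = \delta_{ip}\delta_{jq}.$

Hence

   $z_{ij}(z') = z_{ij}\left(\sum_p \sum_q \alpha_{pq} z'_{pq}\right) = \sum_p \sum_q \alpha_{pq} z_{ij}(z'_{pq}) = \alpha_{ij} = z'(x_i, y_j).$

Remembering now the definition of the symbol $x \otimes y$, we
see that $z_{ij} = x_i \otimes y_j$.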

For the purpose of giving examples later we consider
the spaces $\mathfrak{P}_n$ and $\mathfrak{P}_m$ (see §2). We leave it as
an exercise for the reader to prove that their product
$\mathfrak{P}_n \otimes \mathfrak{P}_m$ is isomorphic to the space of all
polynomials $z(s,t)$ in two real variables, with the
property that for each fixed $s$, $z(s,t)$ is of degree





$< m$ in $t$, and for each fixed $t$, $z(s,t)$ is of degree
$< n$ in $s$, in such a way that the direct product $x \otimes y$
of $x = x(s)$ and $y = y(t)$ corresponds to the ordinary
product $z = z(s,t) = x(s)y(t)$.

Let us now try to tie up direct products with linear
transformations. Let $\mathfrak{U}$ and $\mathfrak{V}$ be $n$ and $m$
dimensional vector spaces, and let $A$ and $B$ be any two
linear transformations on $\mathfrak{U}$ and $\mathfrak{V}$ respectively. Let
$\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$ be the direct product of $\mathfrak{U}$ and $\mathfrak{V}$;
we define a linear transformation $C$ on $\mathfrak{W}$, called the
direct product of $A$ and $B$, $C = A \otimes B$, as follows.
We first define a linear transformation $C'$ on $\mathfrak{W}'$ by
the relation

   $C'z'(x,y) = z'(Ax,By);$

we may then define $C$ as the adjoint of $C'$, $C = (C')'$,
or, in symbols,

   $C(z[z'(x,y)]) = z[z'(Ax,By)].$

If, in particular, we apply $C$ to an element $z_0$ of
the form $z_0 = x_0 \otimes y_0$ (i.e. $z_0(z') = z'(x_0, y_0)$ for
every $z'$ in $\mathfrak{W}'$) we obtain $(Cz_0)(z') = z'(Ax_0, By_0)$, i.e.

(3)   $Cz_0 = Ax_0 \otimes By_0.$

Since there are quite a few elements in $\mathfrak{W}$ of the form
$x \otimes y$, enough at any rate to form a basis (Theorem 2),
this last relation characterizes $C$.

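The formal rules for operating with direct products
are the following:

(4)   $0 \otimes A = A \otimes 0 = 0,$

(5)   $1 \otimes 1 = 1,$

(6)   $(A_1 + A_2) \otimes B = (A_1 \otimes B) + (A_2 \otimes B),$

(7)   $A \otimes (B_1 + B_2) = (A \otimes B_1) + (A \otimes B_2),$

(8)   $\alpha A \otimes \beta B = \alpha\beta(A \otimes B),$

(9)   $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1},$

(10)  $(A_1A_2) \otimes (B_1B_2) = (A_1 \otimes B_1)(A_2 \otimes B_2).$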





Formula (9), like all formulae involving inverses,
has to be read with caution: we assert that the existence
of $A^{-1}$ and $B^{-1}$ implies the existence of
$(A \otimes B)^{-1}$, and the validity of (9); conversely, in a
finite dimensional space, the existence of $(A \otimes B)^{-1}$
implies that of $A^{-1}$ and $B^{-1}$. We shall prove (9) and
(10), in reverse order.

Formula (10) follows from the characterization (3)
of direct products and the following computation:

   $(A_1A_2 \otimes B_1B_2)(x \otimes y) = (A_1A_2x) \otimes (B_1B_2y)$
   $= (A_1 \otimes B_1)(A_2x \otimes B_2y) = (A_1 \otimes B_1)(A_2 \otimes B_2)(x \otimes y).$

As an immediate consequence of (10) we obtain

(11)  $A \otimes B = (A \otimes 1)(1 \otimes B) = (1 \otimes B)(A \otimes 1).$

To prove (9), suppose that $A^{-1}$ and $B^{-1}$ exist,
and form $A \otimes B$ and $A^{-1} \otimes B^{-1}$. Since, by (10), the
product of these two transformations, in either order,
is 1, $A \otimes B$ has an inverse and this inverse is equal
to $A^{-1} \otimes B^{-1}$. Conversely suppose that $(A \otimes B)^{-1}$ exists;
remembering that we defined direct product for finite
dimensional spaces only, we may invoke Theorem 2 of §24:
we shall show that $Ax = 0$ implies that $x = 0$, and
$By = 0$ implies that $y = 0$. We use (3): $Ax \otimes By =
(A \otimes B)(x \otimes y)$. If either factor on the left is zero
then $(A \otimes B)(x \otimes y) = 0$, whence $x \otimes y = 0$, so that
either $x = 0$ or $y = 0$. Since (by (4)) $B = 0$ is
impossible, we may find a vector $y$ for which $By \neq 0$.
Applying the above argument to this $y$, together with any
$x$ for which $Ax = 0$, it follows that $x = 0$. The same
argument with the roles of $A$ and $B$ interchanged
proves that $B$ has an inverse.

For an interesting example of the direct product of
two transformations we take $\mathfrak{U}$ and $\mathfrak{V}$ to be $\mathfrak{P}_n$ and
$\mathfrak{P}_m$, and $A$ and $B$ to be differentiation on $\mathfrak{P}_n$
and $\mathfrak{P}_m$ respectively. Since the space $\mathfrak{U} \otimes \mathfrak{V}$ may be
thought of as a space of polynomials in two variables,




the direct product $C = A \otimes B$, applied to $z(s,t)$,
yields the mixed partial derivative
$Cz = \partial^2 z / \partial s\, \partial t$.
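
This example lends itself to a direct numerical check. The Python sketch below is an illustration and not part of the original text: the function names are hypothetical, a polynomial $z(s,t)$ is assumed to be stored as its array of coefficients with respect to the products $s^i t^j$, and $A \otimes B$ is applied through the characterizing relation $(A \otimes B)(x \otimes y) = Ax \otimes By$ extended by linearity.

```python
def differentiation_matrix(n):
    """Matrix of d/ds on polynomials of degree < n in the basis 1, s, ..., s**(n-1).

    Column k holds the coordinates of the derivative of s**k.
    """
    d = [[0] * n for _ in range(n)]
    for k in range(1, n):
        d[k - 1][k] = k          # d/ds s**k = k * s**(k-1)
    return d


def apply_direct_product(a, b, c):
    """Apply A (x) B to z = sum of c[j][q] * (x_j (x) y_q).

    Uses (A (x) B)(x_j (x) y_q) = A x_j (x) B y_q and linearity; the
    result is again an array of coefficients with respect to x_i (x) y_p.
    """
    n, m = len(a), len(b)
    return [[sum(a[i][j] * b[p][q] * c[j][q] for j in range(n) for q in range(m))
             for p in range(m)] for i in range(n)]


def mixed_partial(c):
    """Coefficients of the mixed partial derivative of sum c[i][j] * s**i * t**j."""
    n, m = len(c), len(c[0])
    out = [[0] * m for _ in range(n)]
    for i in range(1, n):
        for j in range(1, m):
            out[i - 1][j - 1] = i * j * c[i][j]
    return out


# z(s, t) = 1 + 2 s t + 3 s**2 t, with n = 3 and m = 2.
z = [[1, 0],
     [0, 2],
     [0, 3]]

d_s = differentiation_matrix(3)
d_t = differentiation_matrix(2)
print(apply_direct_product(d_s, d_t, z))
print(mixed_partial(z))
```

Both computations give the coefficients of $2 + 6s$, which is the mixed partial derivative of the sample polynomial.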

An interesting (and complicated) side of the theory
of direct products is the theory of Kronecker products
of matrices. Let $\mathfrak{X} = \{x_1, \ldots, x_n\}$ and
$\mathfrak{Y} = \{y_1, \ldots, y_m\}$ be bases in $\mathfrak{U}$ and $\mathfrak{V}$, and let
$[A] = [A; \mathfrak{X}] = (\alpha_{ij})$ and $[B] = [B; \mathfrak{Y}] = (\beta_{pq})$ be
the matrices of $A$ and $B$. What is the matrix of
$A \otimes B$ in the coordinate system $\{x_i \otimes y_p\}$?

To answer this question we must recall the discus¬ 
sion in §2^ concerning the arrangement of a basis in 
some linear order. Since, unfortunately, it is impossible
to write down a matrix without committing one's
self to an order of rows and columns, we shall be frank
about it, and arrange the $n$ times $m$ vectors $x_i \otimes y_p$
in the so called lexicographical order, as follows:

   $x_1 \otimes y_1, x_1 \otimes y_2, \ldots, x_1 \otimes y_m, x_2 \otimes y_1, \ldots,$
   $x_2 \otimes y_m, \ldots, x_n \otimes y_1, \ldots, x_n \otimes y_m.$

We may also carry out the following computation:

(12)  $(A \otimes B)(x_j \otimes y_q) = Ax_j \otimes By_q$
      $= \left(\sum_i \alpha_{ij} x_i\right) \otimes \left(\sum_p \beta_{pq} y_p\right)$
      $= \sum_i \sum_p \alpha_{ij} \beta_{pq} (x_i \otimes y_p).$

This process indicates exactly how far we can get without
ordering the basis elements; if, for example, we
agree to index the elements of a matrix not with a pair
of integers but with a pair of pairs, $(i,p)$ and $(j,q)$,
then we know now that the element in the $(i,p)$-th row
and $(j,q)$-th column is $\alpha_{ij}\beta_{pq}$. If we use the
lexicographical ordering, the matrix of $A \otimes B$ has the form:


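(13)  $\begin{pmatrix}
      \alpha_{11}\beta_{11} & \cdots & \alpha_{11}\beta_{1m} & \cdots & \alpha_{1n}\beta_{11} & \cdots & \alpha_{1n}\beta_{1m} \\
      \vdots & & \vdots & & \vdots & & \vdots \\
      \alpha_{11}\beta_{m1} & \cdots & \alpha_{11}\beta_{mm} & \cdots & \alpha_{1n}\beta_{m1} & \cdots & \alpha_{1n}\beta_{mm} \\
      \vdots & & \vdots & & \vdots & & \vdots \\
      \alpha_{n1}\beta_{11} & \cdots & \alpha_{n1}\beta_{1m} & \cdots & \alpha_{nn}\beta_{11} & \cdots & \alpha_{nn}\beta_{1m} \\
      \vdots & & \vdots & & \vdots & & \vdots \\
      \alpha_{n1}\beta_{m1} & \cdots & \alpha_{n1}\beta_{mm} & \cdots & \alpha_{nn}\beta_{m1} & \cdots & \alpha_{nn}\beta_{mm}
      \end{pmatrix}$

In a condensed notation whose meaning is clear we may
write this matrix as

(14)  $\begin{pmatrix}
      \alpha_{11}[B] & \cdots & \alpha_{1n}[B] \\
      \vdots & & \vdots \\
      \alpha_{n1}[B] & \cdots & \alpha_{nn}[B]
      \end{pmatrix}.$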

This matrix is known as the Kronecker product of $[A]$ and
$[B]$, in this order. The rule for forming it is easy to
describe in words: replace each element $\alpha_{ij}$ of the $n$
times $n$ matrix $[A]$ by the $m$ times $m$ matrix $\alpha_{ij}[B]$.
If in this rule we interchange the roles of $A$ and $B$
(and consequently interchange $n$ and $m$) we obtain the
definition of the Kronecker product of $[B]$ and $[A]$.
Query: is there an arrangement of the basis vectors
$x_i \otimes y_p$, such that the matrix of $A \otimes B$, referred to
the arranged coordinate system, is the Kronecker product
of $[B]$ and $[A]$?
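
The rule for forming the Kronecker product, and the formal rule (10), can both be tried on small matrices. The following Python sketch is an illustration only and is not part of the original text; the function names and the sample matrices are assumptions made for the example, and rows and columns are numbered from zero.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows (sizes assumed compatible)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kronecker(a, b):
    """Replace each entry a[i][j] of a by the block a[i][j] * b.

    With the basis x_i (x) y_p in lexicographical order, the row (i, p)
    has position i * m + p and the column (j, q) has position j * m + q.
    """
    n, m = len(a), len(b)
    return [[a[i][j] * b[p][q] for j in range(n) for q in range(m)]
            for i in range(n) for p in range(m)]


a1, a2 = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
b1 = [[2, 0, 1], [0, 1, 0], [1, 0, 0]]
b2 = [[1, 1, 0], [0, 1, 0], [0, 0, 3]]

left = kronecker(matmul(a1, a2), matmul(b1, b2))       # (A1 A2) (x) (B1 B2)
right = matmul(kronecker(a1, b1), kronecker(a2, b2))   # (A1 (x) B1)(A2 (x) B2)

print(left == right)
print(kronecker([[1, 2], [3, 4]], [[0, 1], [1, 0]]))
```

The first line printed confirms rule (10) for the sample matrices; the second shows the four blocks $\alpha_{ij}[B]$ assembled into a single 4 by 4 matrix.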

Let us now attempt to introduce a sensible inner
product into the direct product of two unitary spaces.
It is technically easier to define inner product not in
$\mathfrak{W} = \mathfrak{U} \otimes \mathfrak{V}$ but in the auxiliary space $\mathfrak{W}'$ and then
to apply the general theory of the conjugate space of a





unitary space to find an inner product in $\mathfrak{W}$.

If $z' = z'(x,y)$ is any element of $\mathfrak{W}'$, $z'$ may
be written as a sum of products of the form $x'(x)y'(y)$,
where $x'$ and $y'$ are linear functionals on $\mathfrak{U}$ and $\mathfrak{V}$
respectively. Since $\mathfrak{U}$ and $\mathfrak{V}$ are unitary spaces this
implies that $z'(x,y)$ may be written as a finite sum of
expressions of the form $(x,x_0)(y,y_0)$; say

(15)  $z'(x,y) = \sum_i (x,x_i)(y,y_i).$

Hence if $z_1'$ and $z_2'$ are any two elements of $\mathfrak{W}'$ we
may write

(16)  $z_1'(x,y) = \sum_i (x,x_{i1})(y,y_{i1}),$

(17)  $z_2'(x,y) = \sum_j (x,x_{j2})(y,y_{j2}),$

and we may define

   $(z_1',z_2') = \sum_i \sum_j (x_{j2},x_{i1})(y_{j2},y_{i1}).$

(The conjugate nature of the relation between vectors and
linear functionals on a unitary space again necessitates
putting $x_{j2}$ before $x_{i1}$.) Before we can even start to
prove that this definition fulfills the conditions of the
definition of an inner product, we must prove that it
defines $(z_1',z_2')$ independently of the representations of
$z_1'$ and $z_2'$ as sums in (16) and (17). To do this we
observe that

   $\sum_i (x_{j2},x_{i1})(y_{j2},y_{i1}) = z_1'(x_{j2},y_{j2}),$

so that $(z_1',z_2') = \sum_j z_1'(x_{j2},y_{j2})$ is independent of the
particular representation of $z_1'$. Since, moreover, in
any given representations of $z_1'$ and $z_2'$ it is true that
$(z_1',z_2') = \overline{(z_2',z_1')}$, it follows that $(z_1',z_2')$ is also
independent of the representation of $z_2'$.

It is easy to verify that the expression $(z_1',z_2')$ is
linear in $z_1'$, conjugate linear in $z_2'$, and Hermitian
symmetric. The non trivial part of this discussion is
the proof that it is also positive definite, i.e. that
$(z',z') > 0$ for all $z' \neq 0$.





Using the representation (15) we have

$(z',z') = \sum_i \sum_j (x_j,x_i)(y_j,y_i)$.

For any complex numbers $\{\xi_i\}$ we have

$\sum_i \sum_j \xi_j \bar{\xi}_i (x_j,x_i) = (\sum_j \xi_j x_j, \sum_i \xi_i x_i) = \| \sum_j \xi_j x_j \|^2 \geq 0$,

so that the matrix whose general element is $(x_j,x_i)$ is
non negative. Similarly of course the matrix whose general
element is $(y_j,y_i)$ is non negative; it follows
from Theorem 2 of §69 that the same is true of the Hadamard
product $((x_j,x_i)(y_j,y_i))$. Hence

$\sum_i \sum_j \xi_j \bar{\xi}_i (x_j,x_i)(y_j,y_i) \geq 0$

for every set of complex $\xi$'s; choosing $\xi_i = 1$ for all
$i$, we see that $(z',z') \geq 0$ for all $z'$.
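The non negativity of the Hadamard product of two such matrices can be checked on a small example. The sketch below is only a numerical illustration under stated assumptions: the vectors `xs` and `ys` are hypothetical sample data, the inner product is the standard one on complex coordinate space, and the helper names are introduced here for illustration.

```python
def inner(u, v):
    """Inner product (u, v): linear in u, conjugate linear in v."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


def gram(vectors):
    """Matrix whose (i, j) entry is (v_j, v_i)."""
    return [[inner(vj, vi) for vj in vectors] for vi in vectors]


def hadamard(g, h):
    """Entrywise product of two matrices of the same shape."""
    return [[a * b for a, b in zip(r, s)] for r, s in zip(g, h)]


def quadratic_form(mat, xi):
    """The sum over i and j of xi_j * conj(xi_i) * mat[i][j]."""
    return sum(
        complex(xi[j]) * complex(xi[i]).conjugate() * mat[i][j]
        for i in range(len(xi))
        for j in range(len(xi))
    )


xs = [[1, 1j], [2, 0], [0, 1 - 1j]]        # hypothetical x_1, x_2, x_3
ys = [[1, 0, 2], [1j, 1, 0], [1, 1, 1]]    # hypothetical y_1, y_2, y_3
product = hadamard(gram(xs), gram(ys))

# Each value should be real and non negative, up to rounding error.
samples = ([1, 1, 1], [1, -1, 0], [1j, 2, -1 + 1j])
values = [quadratic_form(product, xi) for xi in samples]
print(values)
```

The first sample, with every $\xi_i = 1$, is exactly the choice made in the text and gives the value of $(z',z')$ for these vectors.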

In order to prove that $(z',z') = 0$ implies $z' = 0$
we proceed as follows. For the expression $(z'_1,z'_2)$,
which by now has all other properties of an inner product,
we may prove Schwarz's inequality just as before:

$|(z'_1,z'_2)|^2 \leq (z'_1,z'_1)(z'_2,z'_2)$.

It follows that the vanishing of $(z'_1,z'_1)$ implies the
vanishing of $(z'_1,z'_2)$ for all $z'_2$. Let $x_0$ and $y_0$ be
arbitrary vectors and take in particular

$z'_2 = z'_2(x,y) = (x,x_0)(y,y_0)$.

The vanishing of $(z'_1,z'_1)$ implies that

$0 = (z'_1,z'_2) = \sum_i (x_0,x_{i1})(y_0,y_{i1}) = z'_1(x_0,y_0)$;

hence $z'_1(x_0,y_0) = 0$ for all $x_0$ and $y_0$, so that $z'_1 = 0$.
This concludes the introduction of an inner product
in $\mathfrak{W}'$; we denote the unitary space so obtained by
$\mathfrak{W}^*$. Applying the results of §50 we obtain an inner
product in the conjugate space $\mathfrak{W}$ of $\mathfrak{W}^*$.

It is now easy to prove that the inner product
defined in $\mathfrak{W}$ has the property that

$(x_1 \otimes y_1, x_2 \otimes y_2) = (x_1,x_2)(y_1,y_2)$.

We write $z_1 = x_1 \otimes y_1$, $z_2 = x_2 \otimes y_2$. We consider also
the elements $z^*_1$ and $z^*_2$ of $\mathfrak{W}^*$ defined by

$z^*_1(x,y) = (x,x_1)(y,y_1)$,

$z^*_2(x,y) = (x,x_2)(y,y_2)$,

and we define the elements $\tilde{z}_1$ and $\tilde{z}_2$ of $\mathfrak{W}$ by

$\tilde{z}_1 = \tilde{z}_1(z^*) = (z^*,z^*_1)$,

$\tilde{z}_2 = \tilde{z}_2(z^*) = (z^*,z^*_2)$.

For an arbitrary $z^* = z^*(x,y)$ in $\mathfrak{W}^*$, with the
representation

$z^*(x,y) = \sum_i (x,x_i^0)(y,y_i^0)$,

we have

$\tilde{z}_1(z^*) = \sum_i (x_1,x_i^0)(y_1,y_i^0)$,

$\tilde{z}_2(z^*) = \sum_i (x_2,x_i^0)(y_2,y_i^0)$,

whence

$\tilde{z}_1(z^*) = z^*(x_1,y_1) = z_1(z^*)$,

$\tilde{z}_2(z^*) = z^*(x_2,y_2) = z_2(z^*)$.

(This is very similar to the proof in §52 of the equality
of the two natural correspondences between a unitary
space and its conjugate.) Hence we have, finally,

$(z_1,z_2) = (\tilde{z}_1,\tilde{z}_2) = (z^*_2,z^*_1) = (x_1,x_2)(y_1,y_2)$,

as was to be proved.

The last proved fact justifies once more the direct
product terminology and describes completely the structure
of $\mathfrak{W}$ and its relation to $\mathfrak{U}$ and $\mathfrak{V}$. It follows
also that if $\{x_i\}$ and $\{y_p\}$ are orthogonal bases in
$\mathfrak{U}$ and $\mathfrak{V}$ respectively then

$(x_i \otimes y_p, x_j \otimes y_q) = (x_i,x_j)(y_p,y_q) = \delta_{ij}\delta_{pq}$,

so that the $\{x_i \otimes y_p\}$ form an orthonormal set in $\mathfrak{W}$.
Since we have already seen that they form a linear basis,
it follows that they are a complete orthonormal set, or
an orthogonal basis, in $\mathfrak{W}$.
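The multiplicative property of the inner product can be illustrated in coordinates. The sketch below rests on an assumption that is not part of the text's coordinate-free construction: it identifies the direct product of two coordinate spaces with the space of coordinate arrays whose entries are the products $\xi_i\eta_p$, under the standard inner product. The names `inner` and `tensor` and the sample vectors are hypothetical.

```python
def inner(u, v):
    """Standard inner product: linear in u, conjugate linear in v."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


def tensor(x, y):
    """Coordinates of x (tensor) y: all products x_i * y_p, flattened."""
    return [a * b for a in x for b in y]


# Hypothetical sample vectors in 2 and 3 complex dimensions.
x1, y1 = [1 + 2j, 3], [2, -1j, 1]
x2, y2 = [0.5j, 1 - 1j], [1, 1, 2j]

lhs = inner(tensor(x1, y1), tensor(x2, y2))
rhs = inner(x1, x2) * inner(y1, y2)
print(abs(lhs - rhs) < 1e-12)

# The products of the standard basis vectors form an orthonormal set.
e = [[1, 0], [0, 1]]
f = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
products = [tensor(ei, fp) for ei in e for fp in f]
gram_matrix = [[inner(u, v) for v in products] for u in products]
```

Here `gram_matrix` is the 6 by 6 identity matrix, the coordinate counterpart of the relation $\delta_{ij}\delta_{pq}$.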




APPENDIX III 


HILBERT SPACE 


Probably the most useful and certainly the best
developed generalization of the theory of finite dimensional
unitary spaces to infinite dimensions is the
theory of Hilbert space. Without going into details and
entirely without proofs we shall now attempt to indicate
how this generalization proceeds and what are the main
difficulties which have to be overcome.

The definition of Hilbert space is easy: it is an
infinite dimensional unitary space satisfying one extra
condition. That this condition (namely completeness)
is automatically satisfied in the finite dimensional case
is proved in elementary analysis. In the infinite dimensional
case it may be possible that for a sequence $\{x_n\}$
of vectors $\|x_n - x_m\| \to 0$ as $n, m \to \infty$, and still
there is no vector $x$ for which $\|x - x_n\| \to 0$; the
only effective way of ruling out this possibility is
explicitly to assume its opposite. In other words: a
Hilbert space is a complete infinite dimensional unitary
space. (Sometimes the concept of Hilbert space is restricted
by an additional cardinal number condition,
namely separability, but in recent years, ever since
the realization that this additional restriction doesn't
pay for itself in results, it has become customary to
use "Hilbert space" for the concept we defined.)

It is easy to see that the space $\mathfrak{P}$ of polynomials
with the inner product $(x,y) = \int_0^1 x(t)\overline{y(t)}\,dt$ is not
complete. In connection with the notion of completeness
of certain particular Hilbert spaces there is a quite
extensive mathematical lore: the main assertion of the
celebrated Riesz-Fischer theorem is that the space $\mathfrak{L}_2$
of all functions $x(t)$ for which $|x(t)|^2$ is Lebesgue
integrable in the interval $(0,1)$ is (with the same
formal definition of inner product as for polynomials) a
Hilbert space. Another popular Hilbert space, reminiscent
in its appearance of finite dimensional Euclidean
space, is the space $\mathfrak{l}_2$ of all sequences $\{\xi_n\}$ of complex
numbers for which $\sum_n |\xi_n|^2$ converges.

Using completeness in order to discuss intelligently
the convergence of some infinite series one can proceed
for quite some time in building the theory of Hilbert
spaces without meeting any difficulties due to infinite
dimensionality. Thus our proof of Schwarz's inequality
is valid in the most general case and the notions of
orthogonality and complete orthonormal sets can be defined
exactly as we defined them. Even our proofs of
Bessel's inequality and of the equivalence of the various
possible formulations of completeness for an orthonormal
set have to undergo only slight verbal changes. (The
convergence of the various infinite series that enter is
an automatic consequence of Bessel's inequality.) Finally
the proof of the existence of complete orthonormal
sets parallels closely the proof in the finite case; in
the unconstructive proof transfinite induction (or Zorn's
lemma) replaces ordinary induction, and even the constructive
steps of the Gram-Schmidt process are easily
carried out.

In the discussion of manifolds, functionals, and
transformations the situation becomes uncomfortable if we
do not make a concession to the topology of Hilbert space.
Good analogs of all our statements for the finite dimensional
case can be proved if we consider closed linear
manifolds, continuous linear functionals, and bounded
linear transformations. (In a finite dimensional space
every linear manifold is closed, every linear functional
is continuous, and every linear transformation is
bounded.) If, however, we do agree to make these concessions
then once more we can coast on our finite dimensional
proofs without any change most of the time and
with only the insertion of an occasional $\varepsilon$ the rest of
the time. Thus once more we obtain that $\mathfrak{V} = \mathfrak{M} \oplus \mathfrak{M}^{\perp}$
and $\mathfrak{M} = \mathfrak{M}^{\perp\perp}$ and that every linear functional of $x$
has the form $(x,y)$; our definitions of Hermitian and
non negative transformations still make sense, and all
our theorems about perpendicular projections (as well as
their proofs) carry over without change.

The first hint of how things can go wrong comes
from the study of unitary transformations. We still call
a transformation $U$ unitary if $UU^* = U^*U = 1$, and it
is still true that a unitary transformation is isometric,
i.e. $\|Ux\| = \|x\|$ for all $x$, or equivalently $(Ux,Uy) = (x,y)$
for all $x$ and $y$. It is, however, easy to
construct an isometric transformation which is not unitary;
because of its importance in the construction of
counter examples we shall describe one such transformation.
We consider a Hilbert space in which there is a
countable complete orthonormal set, say $x_1, x_2, \ldots$.
A unique bounded linear transformation $U$ is defined by
the conditions $Ux_n = x_{n+1}$ for $n = 1, 2, \ldots$; this $U$
is isometric but not unitary since $U^*U = 1$, but $UU^*x_1 = 0$.
The theory of Cayley transforms is affected by the
distinction between unitary and isometric, but only to a
comparatively slight extent; we omit the description
of the differences.
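The transformation $U$ can be tried out on finitely supported coordinate sequences. This is a minimal sketch under an assumption of ours: a vector is represented by the list of its coordinates with respect to $x_1, x_2, \ldots$, all omitted coordinates being zero. The function names are hypothetical.

```python
def shift(x):
    """U: sends x_n to x_{n+1}, i.e. prepends a zero coordinate."""
    return [0] + list(x)


def shift_adjoint(x):
    """U*: drops the first coordinate and moves the others down."""
    return list(x[1:])


def inner(u, v):
    """Inner product of finitely supported sequences.

    zip stops at the shorter list, which is harmless because the
    missing coordinates are zero.
    """
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))


x = [1, 2j, -3]
y = [4, 0, 1 + 1j, 2]
x_1 = [1, 0, 0]   # first basis vector

print(inner(shift(x), shift(x)) == inner(x, x))           # U is isometric
print(inner(shift(x), y) == inner(x, shift_adjoint(y)))   # U* is the adjoint
print(shift_adjoint(shift(x)) == x)                       # U*U = 1
print(shift(shift_adjoint(x_1)))                          # UU* x_1 is zero
```

The last line prints a list of zeros: $U^*$ annihilates $x_1$, so $UU^*$ cannot be the identity.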

It is when we come to the spectral theory that the
whole flavor of the development changes radically. The
definition of proper value as a number $\lambda$ for which
$Ax = \lambda x$ has a solution $x \neq 0$ still makes sense and
is still useful in many contexts, and our theorem about
the reality of the proper values of a Hermitian operator
is still true. The notion of proper value loses, however,
much of its significance. Proper values are so very useful
in the finite dimensional case because they are a
handy way of describing the fact that something goes
wrong with the inverse of $A - \lambda 1$, and the only thing
that can go wrong is that the inverse does not exist.
Essentially different things can happen in the infinite
dimensional case; just to illustrate the possibilities
we mention that, for example, the inverse of $A - \lambda 1$
may exist but be unbounded. That there is no useful
generalization of determinant, and hence of the characteristic
equation, is the least of our worries. The whole
theory has, in fact, attained its full beauty and maturity
only after the slavish imitation of such finite dimensional
methods was given up.

After some appreciation of the fact that the infinite
dimensional case has to overcome great difficulties,
it comes as a pleasant surprise that the spectral
theorem for Hermitian (and even for normal) operators
does have a very beautiful and powerful analog. (Although
we describe the theorem for bounded operators only
there is a large class of unbounded ones for which it is
valid.) In order to be able to understand the analogy
let us re-examine the finite dimensional case.

Let $A$ be a Hermitian linear transformation on a
finite dimensional vector space and let $A = \sum_j \lambda_j P_j$
be its spectral form. Let us write

$E(\lambda) = \sum_{\lambda_j < \lambda} P_j$,

where the summation is extended over those values of $j$
for which $\lambda_j < \lambda$. It is clear that, for each $\lambda$, $E(\lambda)$
is a (perpendicular) projection and it is easy to see
that the characteristic properties of the $E(\lambda)$ are
the following.

(1) For $\lambda < \mu$, $E(\lambda) \leq E(\mu)$.

(2) For $\lambda$ sufficiently large $E(\lambda) = 1$ and
$E(-\lambda) = 0$.

(3) $AE(\lambda) = E(\lambda)A$ for all $\lambda$.

(4) For $\varepsilon > 0$ sufficiently small, $E(\lambda - \varepsilon) = E(\lambda)$.

(5) $A = \sum_j \lambda_j [E(\lambda_{j+1}) - E(\lambda_j)]$.

Those familiar with Stieltjes integration will recognize
the sum in (5) as a typical approximating sum to an integral
of the form $\int \lambda\, dE(\lambda)$ and will therefore see
how one may expect the generalization to go. The exact
statements (4) and (5) are replaced by the limiting
statements,

(4') $\lim_{\varepsilon \to 0} E(\lambda - \varepsilon) = E(\lambda)$, $\varepsilon > 0$;

(5') $A = \int \lambda\, dE(\lambda)$.

Except for this obvious alteration the spectral theorem
remains true in Hilbert space. We have, of course, to
interpret correctly the meaning of the limiting operations.
Once more we are faced with the three possibilities
we mentioned in §75 (called uniform, strong, and
weak convergence respectively); it turns out that (4') is
to be given the strong and (5') the uniform interpretation.
(The reader deduces of course from our language
that the three possibilities are indeed distinct in
Hilbert space.)

We have seen that the projections $P_j$ entering
into the spectral form of $A$ in the finite dimensional
case are very simple functions of $A$ (§65). Since the
$E(\lambda)$ are obtained from the $P_j$ by summation they also
are functions of $A$, and it is quite easy to describe
what functions. We write $g_\lambda(t) = 1$ if $t < \lambda$ and
$g_\lambda(t) = 0$ otherwise; then $E(\lambda) = g_\lambda(A)$. This fact
gives the main clue to the proof of the spectral theorem
in Hilbert space. The usual process is to discuss the
functional calculus for polynomials and by limiting processes
to extend it to functions of Baire class 1 (i.e.
to such functions as $g_\lambda(t)$). Once this is done we
may write, for any given $A$, $E(\lambda) = g_\lambda(A)$ by definition;
there is no particular difficulty in the proof of
assertions (1), (2), (3), (4'), (5').
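The finite dimensional family $E(\lambda)$ is easy to compute by hand or by machine. The sketch below uses a hypothetical 2 by 2 example, $A = 1 \cdot P_1 + 3 \cdot P_2$, with exact rational arithmetic; the matrix helpers are ours. It also assumes, for the telescoping sum, that the number standing in for "$\lambda_{j+1}$" after the largest proper value is any number exceeding it (here 4).

```python
from fractions import Fraction as F


def mat_add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in r] for r in a]


def mat_mul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


half = F(1, 2)
P1 = [[half, -half], [-half, half]]   # projection for the proper value 1
P2 = [[half, half], [half, half]]     # projection for the proper value 3
spectrum = [(F(1), P1), (F(3), P2)]

ZERO = [[F(0), F(0)], [F(0), F(0)]]
ONE = [[F(1), F(0)], [F(0), F(1)]]
A = mat_add(mat_scale(1, P1), mat_scale(3, P2))   # equals [[2, 1], [1, 2]]


def E(lam):
    """Sum of the P_j over those j whose proper value is strictly below lam."""
    total = ZERO
    for value, proj in spectrum:
        if value < lam:
            total = mat_add(total, proj)
    return total


def diff(a, b):
    return mat_add(a, mat_scale(-1, b))


# Telescoping sum 1 * [E(3) - E(1)] + 3 * [E(4) - E(3)].
recovered = mat_add(mat_scale(1, diff(E(3), E(1))), mat_scale(3, diff(E(4), E(3))))
print(recovered == A)
```

Because the inequality in the definition of `E` is strict, `E(3)` does not yet contain `P2`, which is why `E` is unchanged when its argument is lowered slightly.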

After the spectral theorem is proved it is easy to
deduce from it the analogs of our theorems concerning
square roots, the general functional calculus, the polar
decomposition, and properties of commutativity, and in
fact to answer practically every askable question concerning
bounded normal operators.

The chief difficulties that remain are the considerations
of non normal and of unbounded operators. Concerning
non normal transformations it is easy to describe
the state of our knowledge: it is non existent. No
even unsatisfactory analog exists of the superdiagonal
form or of the Jordan canonical form and the theory of
elementary divisors. Very different is the situation
concerning normal (and particularly Hermitian) unbounded
transformations. (The reader will sympathize with the
desire to treat such transformations if he recalls that
the first and most important functional operation that
most of us learn is differentiation.) In this connection
we shall barely hint at the main obstacle the theory
faces. It is not very difficult to show that if a Hermitian
linear transformation $A$ is defined for all vectors
of Hilbert space then it is bounded. In other
words the first requirement concerning transformations
that we are forced to give up is that they be defined
everywhere. The discussion of the precise domain on
which a Hermitian transformation may be defined and of
the extent to which this domain may be enlarged is the
chief new difficulty encountered in the study of unbounded
operators; for the details we invite the reader
to consult the texts listed in the bibliography.




BIBLIOGRAPHY


ALBERT, A. A., Modern Higher Algebra, Chicago, 1937.

BANACH, S., Théorie des Opérations Linéaires, Warszawa,
1932.

COURANT, R. and HILBERT, D., Methoden der Mathematischen
Physik, volume 1, Berlin, 1931.

FRAZER, R. A., DUNCAN, W. J., and COLLAR, A. R., Elementary
Matrices, Cambridge, 1938.

HASSE, H., Höhere Algebra, volume 1, Berlin, 1933.

MacDUFFEE, C. C., The Theory of Matrices, Berlin, 1933.

MURNAGHAN, F. D., The Theory of Group Representations,
Baltimore, 1938.

von NEUMANN, J., Mathematische Grundlagen der Quantenmechanik,
Berlin, 1932.

SCHREIER, O. and SPERNER, E., Vorlesungen über Matrizen,
Leipzig, 1932.

STONE, M. H., Linear Transformations in Hilbert Space,
New York, 1932.

van der WAERDEN, B. L., Moderne Algebra, two volumes,
Berlin, 1937-1940.

WEDDERBURN, J. H. M., Lectures on Matrices, New York,
1934.

WEYL, H., The Classical Groups, Princeton, 1939.

WINTNER, A., Spektraltheorie der Unendlichen Matrizen,
Leipzig, 1929.



LIST OF NOTATIONS


Throughout this book we have observed, whenever
possible, the following conventions. Lower case Greek
letters, $\alpha, \beta, \gamma, \lambda, \mu, \nu, \sigma, \tau$, stand for
scalars in general and real and complex numbers in particular;
lower case Latin letters around the middle of
the alphabet, $i,j,k,m,n,p,q,r$, are used for positive
integers, and those at the end of the alphabet, $x,y,z,u,v$,
for vectors and linear functionals. Capital Latin
letters, $A,B,C,E,F,G,S,T,U,V$, denote linear transformations
and capital German letters, such as $\mathfrak{M}$, $\mathfrak{N}$, $\mathfrak{U}$, $\mathfrak{V}$,
$\mathfrak{W}$, stand for
vector spaces, linear manifolds, and sets of vectors in
general. The most notable violations of these rules
are our occasional use of $s,t$ for real variables, $f,g$
for polynomials and other functions, $T$ for
trace, $R$ and $I$ for real and imaginary part, and $i$
for $\sqrt{-1}$. The capital Greek letters $\Sigma$ and $\Pi$ are
reserved as usual for addition and multiplication. In
general we indicate summation by a symbol such as
$\sum_i \alpha_{ij}\xi_i$; this is to be interpreted as summation
over the entire range of the index $i$. Only
when we depart from this convention do we use such an
expression as $\sum_{i=1}^{n}$.

Superscripts, as in $A^2$, $A^n$, generally stand for
exponents; on some rare occasions (notably in §69) when
the alphabet is nearly exhausted we use them merely as
indices.

The braces $\{ \ldots \}$ are generally used to denote a
set; thus $\{\xi_1, \ldots, \xi_n\}$ may be the coordinates of a
vector, and $\{x_n\}$ may stand for a finite or infinite
sequence of vectors. To this rule there is one important
exception: when we wish explicitly to write out
a finite set of vectors we use the parentheses
$(x_1, \ldots, x_n)$ instead of the braces in order to emphasize
the fact that we are not discussing coordinates.

We use the double arrow $\Rightarrow$ for implication and
$\Leftrightarrow$ for equivalence (i.e. implication in both directions);
the simple arrow $\to$ denotes the effect of a
mapping or, sometimes, the convergence of a sequence;
the correct interpretation will always be clear from
the context.

We adjoin a list of some of the other symbols we
used; the numbers refer to sections and Roman numerals
to the appendices.



24 

A* 

32 

[A] 

25 

[A; I ] 

25 

A* 

51 

D 

20 

^Ij 

6 

A 

39 

^n 

2 

inf 

71 


1 0 

J 

20 

TJl 

16 

^ _ A. 


46 

71 (A) 

56 

V (A) 

57 

0 

9 

“P 

2 

P„ 

57 

771 


2 

71(A) 

56 


2 


$\rho(A)$   37
sup   71
$T$   20
$\mathfrak{V}'$   12
$\mathfrak{V}^*$   50
$\cap$   9
$\subset$   9
$[\ ,\ ]$   13
$(\ ,\ )$   44
$\oplus$   17
$\otimes$   II
$\geq$   56
$\| \ldots \|$   44
$||| \ldots |||$   71
$\{ \ldots : \ldots \}$   71





INDEX OF DEFINITIONS


(Numbers refer to sections; Roman numerals to
appendices.)

adjoint   32
annihilator   16
anti-isomorphism   32
basis, linear   6
    orthogonal   48
bilinear form   53
bilinear functional   13
bound   71
Cartesian decomposition   54
Cayley transform   61
characteristic polynomial   39
cogredient   34
complete orthonormal set   46
complete reducibility   28
complete space   III
conjugate isomorphism   50
conjugate space   12
contragredient   34
contravariant   34
convergence, transformations   75
    vectors   70
coordinate system, linear   6
    orthogonal   48
covariant   34
determinant   59

diagonal matrix   63
dimension, linear   7
    orthogonal   46
direct product   II
direct sum, spaces   17
    transformations   28
disjoint   9
distance   43
divisor of zero   23
dual basis   14
elementary divisor   I
finite dimensional   6
functions of operators   65
Hadamard product   69
Hamilton-Cayley equation   I
Hermitian   54
idempotent   29
inner product   44
invariant   27
inverse   24
involution   31
isometric   III
isomorphism   6
Kronecker delta   6
Kronecker product   II
linear combination   5
linear dependence   4
linear functional   12
linear manifold   9
linear transformation   20
length   44
lower bound   73
matrix   25
minimal polynomial   I
multiplicity, algebraic   40
    geometric   40
    of a proper value   59
    of an elementary divisor   I
natural correspondence   15
nilpotent   I
non negative   56
non singular   24
norm   44
normal transformation   64
null space   36
nullity   37
operator   20
origin   1
orthogonal, complement   46
    manifolds   46
    projections   57
    transformations   60
    vectors   46
orthogonalization   48
orthonormal   46
perpendicular projection   57
polar decomposition   67
polarization   53
positive definite   56
projection   29
proper value   59
proper vector   39
quadratic form   53
quotient space   I
range   36
rank   37
reducibility   27
reflexive   15
scalar   1
simple, elementary divisor   I
    proper value   39
singular   24
span   10
spectrum   39
square root   66
superdiagonal form   41
symmetric   60
trace   40
unit sphere   53
unitary space   44
unitary transformation   59
upper bound   73
vector space   1












